Closed Bug 1212404 Opened 9 years ago Closed 9 years ago

Retrigger taskcluster build/test through staging treeherder doesn't work

Categories

(Tree Management :: Treeherder, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: edgar, Unassigned)

Details

After clicking the retrigger icon (or hitting the 'r' key) on a taskcluster build/test, I saw "Retrigger request sent" message popup, but nothing else happens.
It worked a couple days ago (when the retrigger icon didn't become a menu).
:emorley, do you know what's happening here? Thank you.
Flags: needinfo?(emorley)
Could you link to which push/job this occurred on?
If you inspect the request sent in the web console, was it successful and what was the body of the response?

I can't see any errors in New Relic, so presumably Treeherder is sending the pulse notification (that Taskcluster listens to) - Greg, would you have any ideas?
Flags: needinfo?(emorley) → needinfo?(garndt)
As far as the integration between treeherder and taskcluster goes, the retrigger event is a best effort service because it relies on pulse messages and something consuming those to cause a task to be retriggered.

I'm looking at our logs and I'm seeing jobs being retriggered (as recent as 15 minutes ago).  Do you have a task ID?

Looking at the logs from yesterday it appears that there was about a 4 hour window where no retriggers were logged.  Because we rely on being connected to pulse, if that connection drops currently we do not retry connecting so some pulse messages could be lost.  I believe there is some work out there to add connection retrying to our pulse listener.
Flags: needinfo?(garndt)
Here is my try push: https://treeherder.allizom.org/#/jobs?repo=try&revision=dc8492f5b3dc&group_state=expanded

I did few retriggers on some builds and tests on Treeherder UI, like
https://tools.taskcluster.net/task-inspector/#Hf9X-gUSRGqUDDVBDutwmA/
https://tools.taskcluster.net/task-inspector/#b99gBGCFRXO4I2V6YX8C7A/
....

but nothing happened.

Thanks you.
Flags: needinfo?(garndt)
Flags: needinfo?(emorley)
This happened last night as well for someone, but then a few minutes later philor was able to retrigger jobs.  I don't see anywhere in the logs where this might have been an error but I'll keep my eye out.

:emorley, is there anything on the treeherder side that's logged when a retrigger action is performed and confirming the pulse message was submitted?
Flags: needinfo?(garndt)
Okay, I could retrigger tc task through production treeherder.
It seems only staging treeherder doesn't work. I just did some retrigger through staging treeherder again, and still nothing happened.
Summary: Retrigger taskcluster build/test doesn't work → Retrigger taskcluster build/test through staging treeherder doesn't work
right now the work around would be to retrigger on treeherder prod for those jobs that show up on production.  They should run correctly and report the task as completed to prod/staging th depending on task configuration.

Found an issue in the logs on our staging mozilla-taskcluster environment that is preventing retriggers from taking place:
https://papertrailapp.com/systems/mozilla-taskcluster-staging/events?r=588826790659928083-588826809853063176

I'm going to remove the ni? for emorley because this doesn't seem to be a treeherder issue.
Flags: needinfo?(emorley)
I'm hitting the same in bug 1212967
I tried again these days, I could retrigger jobs from staging treeherder, close as WORKSFORME. Thank you.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.