Can't add new jobs on treeherder C-C and C-A, as they don't have a decision task

RESOLVED FIXED

2 years ago
2 years ago

People

(Reporter: jorgk, Assigned: KWierso)

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
When I click "Add new jobs" I get "Error fetching runnable jobs" on C-C and C-A.

Comment 1

2 years ago
Moving over to treeherder.

The problem is that thunderbird builds are pure buildbot, AFAICT, and treeherder looks for the decision task, and it doesn't exist. And then it tries to load https://treeherder.mozilla.org/api/project/comm-aurora/runnable_jobs/?decision_task_id= according to the network panel, and that's fubar ;-)
Component: Other → Treeherder
Product: Release Engineering → Tree Management
QA Contact: mshal
Summary: Can't add new jobs on treeherder C-C and C-A → Can't add new jobs on treeherder C-C and C-A, as they don't have a decision task
Version: unspecified → ---
(Assignee)

Comment 2

2 years ago
Hmm, suppose that button should probably be hidden on comm-* trees...
(Reporter)

Comment 3

2 years ago
(In reply to Wes Kocher (:KWierso) from comment #2)
> Hmm, suppose that button should probably be hidden on comm-* trees...
That used to work.
Created attachment 8838884 [details] [review]
[treeherder] KWierso:maybe_fix_runnable_jobs > mozilla:master
(Assignee)

Comment 5

2 years ago
Comment on attachment 8838884 [details] [review]
[treeherder] KWierso:maybe_fix_runnable_jobs > mozilla:master

In that case, maybe this would fix things? Totally untested, but if the problem is in attempting to fetch a decision task id from a tree without decision task ids, I think this might skip over that attempt and let the buildbot jobs still get shown.
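
A minimal sketch of the idea in this comment, not the code in the PR: only fetch task-graph jobs when a decision task id exists, so buildbot jobs are still returned for trees without one. The names `runnable_jobs`, `buildbot_jobs`, and `load_task_graph_jobs` are hypothetical and chosen for illustration.

```python
def runnable_jobs(buildbot_jobs, decision_task_id, load_task_graph_jobs):
    """Return buildbot jobs, plus task-graph jobs when a decision task exists.

    `load_task_graph_jobs` is a hypothetical callable that takes a decision
    task id and returns a list of jobs; it is not a real Treeherder function.
    """
    jobs = list(buildbot_jobs)
    # An empty string or None means the push has no decision task
    # (for example on comm-* trees), so skip the task-graph lookup entirely.
    if decision_task_id:
        jobs.extend(load_task_graph_jobs(decision_task_id))
    return jobs
```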

Not really sure who could review this while armen's out on leave. Will touched this file recently.
Attachment #8838884 - Flags: review?(wlachance)
(Assignee)

Comment 6

2 years ago
Testing locally in Vagrant, the first commit did not fix the error. Adding the second commit fixed the error (the request to the runnable_jobs API returns a 200 status), but it says it couldn't find any runnable jobs. I don't know if that's just a quirk of running in Vagrant or maybe I have something misconfigured.
(Assignee)

Updated

2 years ago
Assignee: nobody → wkocher
Comment on attachment 8838884 [details] [review]
[treeherder] KWierso:maybe_fix_runnable_jobs > mozilla:master

See PR for details.
Attachment #8838884 - Flags: review?(wlachance)
(Assignee)

Comment 8

2 years ago
Comment on attachment 8838884 [details] [review]
[treeherder] KWierso:maybe_fix_runnable_jobs > mozilla:master

Tweaked the commit message, and changed the PR to only catch HTTPErrors, and after that re-raise the exception if it isn't a 404 error. I don't know if this will actually get Add New Jobs working again for comm-* trees (testing locally still says "No runnable jobs" found), but it at least eats the error that's happening.

After this deploys, if the feature's still not working, I'm not sure what else to do other than hide the button for comm-* trees until Armen gets back from PTO next month and can look closer.
Attachment #8838884 - Flags: review?(wlachance)
Comment on attachment 8838884 [details] [review]
[treeherder] KWierso:maybe_fix_runnable_jobs > mozilla:master

Tested this on stage and at least the treeherder parts worked OK; however, new jobs were not actually added on the comm-central tree.

Treeherder *claimed* to have sent the pulse message:

https://github.com/mozilla/treeherder/blob/796931e/treeherder/webapp/api/resultset.py#L288

The actual scheduling logic happens in pulse actions:

https://github.com/mozilla/pulse_actions/blob/master/pulse_actions/handlers/treeherder_add_new_jobs.py

I think we should land Wes's patch (after making the last round of changes that :emorley and I requested). The buildbot/pulse actions part will need someone else to look into them. It's probably something simple.
Attachment #8838884 - Flags: review?(wlachance) → review+
(In reply to William Lachance (:wlach) (use needinfo!) from comment #9)
> I think we should land Wes's patch (after making the last round of changes
> that :emorley and I requested). The buildbot/pulse actions part will need
> someone else to look into them. It's probably something simple.

Oh wait, it's probably just that I ran this on stage (which fires events on a pulse exchange that pulse_actions doesn't listen to). If we deployed this to production it should just work.

Comment 11

2 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/9103411418be8059f6b9a6eff26c7722d0dfd1cb
Bug 1340787 - Fix runnable_jobs API for pushes with no decision task (#2191) r=wlach

This fixes its use for repositories such as comm-*
(Assignee)

Comment 12

2 years ago
Added a logging message in the new failure case and switched it to return a task_id of None instead of an empty string. Everything still worked as well as it ever managed to work in Vagrant.
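
For readers following along, the error handling discussed in this bug (catch only HTTP errors, re-raise anything that isn't a 404, log the failure case, return None) could look roughly like the sketch below. This is an illustration under assumptions, not the landed patch: it uses the standard library's `urllib.error.HTTPError` to stay self-contained, and `lookup_decision_task_id` and its `fetch` parameter are hypothetical names.

```python
import logging
from urllib.error import HTTPError

logger = logging.getLogger(__name__)


def lookup_decision_task_id(fetch, project, revision):
    """Return the decision task id for a push, or None if it has none.

    `fetch` is a hypothetical callable that performs the lookup and raises
    HTTPError when the request fails.
    """
    try:
        return fetch(project, revision)
    except HTTPError as e:
        if e.code != 404:
            # Anything other than "not found" is a real failure: re-raise it.
            raise
        # A 404 is expected for pushes with no decision task (e.g. comm-*).
        logger.info("No decision task found for %s %s", project, revision)
        return None
```

Returning None rather than an empty string lets callers distinguish "no decision task" explicitly with an `is None` check.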
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
(Reporter)

Comment 13

2 years ago
I believe I have added a few jobs after this got fixed, so thanks! Today I tried on comm-beta and it didn't work. I've just tried on comm-central and it hasn't worked there either. No confirmation (and no error message) and no job was triggered. Reopen this bug or open a new one?
Flags: needinfo?(wkocher)
(Assignee)

Comment 14

2 years ago
The request that gets sent out is:
GET /v1/task//artifacts/public%2Ffull-task-graph.json?bewit=<OMITTED> HTTP/1.1
Host: queue.taskcluster.net
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0
Accept: application/json, text/plain, */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://treeherder.mozilla.org/
Origin: https://treeherder.mozilla.org
DNT: 1
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache




The response that comes back is:
HTTP/1.1 404 Not Found
Server: Cowboy
Connection: keep-alive
X-Powered-By: Express
Strict-Transport-Security: max-age=7776000
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT
Access-Control-Request-Method: *
Access-control-allow-headers: X-Requested-With,Content-Type,Authorization,Accept,Origin
X-Content-Type-Options: nosniff
Content-Type: text/html; charset=utf-8
Content-Length: 3131
Date: Thu, 09 Mar 2017 00:02:50 GMT
Via: 1.1 vegur



I think this warrants a new bug.
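
Note that the request path above has an empty task id segment (`/v1/task//artifacts/...`), which is why the queue answers with a 404. A hedged sketch of a guard against building such a URL follows; the actual request is made by the Treeherder UI, so this Python version only illustrates the check, and `full_task_graph_url` is a hypothetical helper name. The host and artifact path are taken from the request shown above.

```python
from urllib.parse import quote

QUEUE_ROOT = "https://queue.taskcluster.net/v1"  # host from the request above
ARTIFACT = "public/full-task-graph.json"


def full_task_graph_url(task_id):
    """Build the full-task-graph artifact URL, or return None without a task id."""
    if not task_id:
        # No decision task: building the URL would yield ".../task//artifacts/..."
        return None
    # safe="" percent-encodes the slash in the artifact name, matching
    # the "public%2Ffull-task-graph.json" form seen in the request.
    return "{}/task/{}/artifacts/{}".format(QUEUE_ROOT, task_id, quote(ARTIFACT, safe=""))
```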
Flags: needinfo?(wkocher)
(Reporter)

Comment 15

2 years ago
(In reply to Wes Kocher (:KWierso) from comment #14)
> I think this warrants a new bug.
Bug 1345798.