Closed Bug 1513480 Opened 5 years ago Closed 5 years ago

Stop creating gzipped version of runnable-jobs.json once bug 1494750 deployed

Categories

(Firefox Build System :: Task Configuration, task)

task
Not set
normal

Tracking

(firefox-esr60 fixed, firefox66 fixed)

RESOLVED FIXED
mozilla66
Tracking Status
firefox-esr60 --- fixed
firefox66 --- fixed

People

(Reporter: emorley, Assigned: dustin)


In bug 1423215 task configuration changes were made such that the runnable jobs file was now output in two formats, the original runnable-jobs.json.gz and a new runnable-jobs.json, to allow for easy transition to the non-gzipped version. These changes were also up

Then in bug 1494750 support was added to Treeherder for consuming the non-gzipped version. Once the changes in that bug are deployed to Treeherder production (and a few days have passed to ensure they are not being backed out for any reason), the taskcluster configs can be adjusted to no longer output the old file:
https://hg.mozilla.org/mozilla-central/rev/88d304c633b6#l1.14

(It will also be necessary to check that there are no other consumers of the file apart from Treeherder)
(In reply to Ed Morley [:emorley] from comment #0)
> These changes were also up

These changes were also uplifted to release/ESR.
Bug 1494750 was deployed some time ago, so this bug can now proceed.
> (It will also be necessary to check that there are no other consumers of the file apart from Treeherder)

This part concerns me -- I have no idea how to check that, but the file has existed for eons, so it doesn't seem unlikely that something else consumes it.
I can't find any remaining usages via:
https://github.com/search?q=runnable-jobs.json.gz&type=Code
https://dxr.mozilla.org/mozilla-central/search?q=runnable-jobs.json.gz&redirect=true
https://dxr.mozilla.org/hgcustom_version-control-tools/search?q=runnable-jobs.json.gz&redirect=false

There is one remaining docs reference that will need updating as part of this bug, however:
https://dxr.mozilla.org/mozilla-central/rev/c2593a3058afdfeaac5c990e18794ee8257afe99/taskcluster/docs/taskgraph.rst#152
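The searches above can also be approximated against a local checkout. The following is a minimal sketch, not the method used in this bug: the function name `find_references` and the choice to skip `.hg`/`.git` directories are assumptions made for illustration.

```python
import os


def find_references(root, needle="runnable-jobs.json.gz"):
    """Return sorted (relative_path, line_number) pairs for lines containing needle.

    Hypothetical helper: walks a local checkout rooted at `root`.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata so only working-copy files are searched.
        dirnames[:] = [d for d in dirnames if d not in {".hg", ".git"}]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets binary files pass through without raising.
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            hits.append((os.path.relpath(path, root), lineno))
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue
    return sorted(hits)
```

A search like this only covers the trees you have checked out, so it complements rather than replaces the access-log check.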

Checking queue's access logs using Papertrail and filtering just on GETs (given the POSTs are the uploads), there are still requests being made, however:
https://papertrailapp.com/systems/taskcluster-queue/events?q=runnable-jobs.json.gz%20method%3DGET

Examples from the last 24 hours:
/v1/task/FLLK-_guS9uk-cdbuP3GMA/artifacts/public/runnable-jobs.json.gz
 -> https://tools.taskcluster.net/groups/FLLK-_guS9uk-cdbuP3GMA/tasks/FLLK-_guS9uk-cdbuP3GMA/details
/v1/task/EF4gaww6RtOhebbZqNU2bA/artifacts/public/runnable-jobs.json.gz
 -> https://tools.taskcluster.net/groups/EF4gaww6RtOhebbZqNU2bA/tasks/EF4gaww6RtOhebbZqNU2bA/details
/v1/task/XTAADI9bRG6yw78cqYmsyw/artifacts/public/runnable-jobs.json.gz
 -> https://tools.taskcluster.net/groups/XTAADI9bRG6yw78cqYmsyw/tasks/XTAADI9bRG6yw78cqYmsyw/details
/v1/task/NN1gLDoVS0SqxxgdqWhMgQ/artifacts/public/runnable-jobs.json.gz
/v1/task/HQQgyBSMRgmO08ashB0ZwQ/artifacts/public/runnable-jobs.json.gz
/v1/task/Zz7Lwm-BSxeovqyOeIGGPQ/artifacts/public/runnable-jobs.json.gz
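Turning request paths like these into task inspector links can be scripted. This is a rough sketch: the path shape and the inspector URL pattern are copied from the examples above, while the function names and the de-duplication are illustrative assumptions.

```python
import re

# Path shape taken from the request paths listed above.
ARTIFACT_RE = re.compile(
    r"^/v1/task/(?P<task_id>[A-Za-z0-9_-]+)/artifacts/public/runnable-jobs\.json\.gz$"
)


def inspector_url(task_id):
    """Build the task inspector URL in the same form as the links above."""
    return f"https://tools.taskcluster.net/groups/{task_id}/tasks/{task_id}/details"


def decision_tasks(paths):
    """Return the distinct task IDs whose gzipped artifact was requested, in first-seen order."""
    seen = []
    for path in paths:
        match = ARTIFACT_RE.match(path.strip())
        if match and match.group("task_id") not in seen:
            seen.append(match.group("task_id"))
    return seen
```

Feeding it the GET paths exported from the log search gives one inspector link per decision task, which makes it easier to see which pushes the requests relate to.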

For the half of the above whose IPs I spot-checked, they all originated from us-west-2.compute.amazonaws.com.
Sadly, most of the Internet is in us-west-2 these days!  I spot-checked a few IPs plucked from fresh logs, and they are not TC instances.

I wonder if those requests are from scriptworker, validating the content matches its signature?  Johan, does CoT look at all artifacts on a decision task?
Flags: needinfo?(jlorenzo)
For sure it doesn't look at runnable-jobs.json: https://github.com/mozilla-releng/scriptworker/search?q=runnable-jobs&type=Code
Flags: needinfo?(jlorenzo)
On IRC, Dustin made a good call: even though runnable-jobs.json is not listed in scriptworker, the latter may download all artifacts of the decision task. I don't remember this behavior.

That said, XTAADI9bRG6yw78cqYmsyw[1] is a task mentioned in comment 4. This graph doesn't have any scriptworker jobs. Moreover, there has been no subgraph inheriting from XTAADI9bRG6yw78cqYmsyw[2]. Therefore, I don't think Chain of Trust is the origin of these requests. 

[1] https://tools.taskcluster.net/groups/XTAADI9bRG6yw78cqYmsyw
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=348eb320f9d74e9c6d06cb602738ba15d4a146fd
I'm sort of out of ideas then.  We could just stop generating these and see what breaks?
& thanks for looking Johan!
Catlee was also doing some tests to check artifact sizes via HEAD requests, iirc.
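For context, a size check via a HEAD request could look roughly like the sketch below. It is not the script referred to above, which is not shown in this bug; the function names are hypothetical, and the caller must supply the full artifact URL because the queue's host name does not appear in this thread.

```python
import urllib.request


def head_request(url):
    """Build a HEAD request, which asks for the headers only and not the body."""
    return urllib.request.Request(url, method="HEAD")


def artifact_size(url, timeout=30):
    """Return the Content-Length reported for url as an int, or None if absent.

    Assumption: the server answers HEAD with a Content-Length header.
    Redirect handling is left to urllib's defaults and not addressed here.
    """
    with urllib.request.urlopen(head_request(url), timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None
```

Requests like this would still appear in access logs even though nothing reads the file contents, so they are worth ruling out before treating a log hit as a real consumer.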
(In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #5)
> Sadly, most of the Internet is in us-west-2 these days!

Heroku's common runtime is us-east, so it at least means these came from something other than Treeherder/the Taskcluster services that run on Heroku.
I'd say let's try turning these off on mozilla-central and seeing if anything breaks.
There are zero code hits and it's a one-liner to restore.
Assignee: nobody → dustin
Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5b35ab00691e
stop writing out deprecated runnable-jobs.json.gz r=emorley
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66