Optimize searchfox taskcluster cron jobs for less frequently updated branches like ESR and release
Categories
(Webtools :: Searchfox, enhancement)
Tracking
(firefox137 fixed)
| | Tracking | Status |
|---|---|---|
| firefox137 | --- | fixed |
People
(Reporter: asuth, Assigned: jcristau)
Attachments
(3 files)
Currently the searchfox-index jobs in https://searchfox.org/mozilla-central/source/.cron.yml run daily for the mozilla-beta, mozilla-release, and mozilla-esr78 branches. This ensures we run the jobs at most once a day even when a branch sees many pushes, but it also means we run them at least once a day even when there have been no pushes at all. Since the taskcluster searchfox jobs use the in-tree MozsearchIndexer.cpp rather than downloading it (or anything else from the mozsearch repo) from an external location, the jobs are effectively deterministic, so extra runs are wasteful beyond making sure we never hit a situation where the artifacts have expired.
It would be good to address this inefficiency.
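For context, skipping redundant runs comes down to asking the taskcluster index whether the current revision was already indexed by a previous run. A minimal sketch of that lookup against the public index API follows; the helper name and the ".searchfox-index" namespace suffix are illustrative assumptions (the gecko.v2.{project}.revision.{revision} prefix is the standard gecko index layout).

```python
import requests

INDEX_API = "https://firefox-ci-tc.services.mozilla.com/api/index/v1"


def find_previous_index_run(project, revision):
    """Return the indexed task for this revision, or None if none exists.

    The ".searchfox-index" suffix is an illustrative choice of namespace.
    """
    namespace = f"gecko.v2.{project}.revision.{revision}.searchfox-index"
    resp = requests.get(f"{INDEX_API}/task/{namespace}")
    if resp.status_code == 404:
        return None  # this revision has not been indexed yet
    resp.raise_for_status()
    return resp.json()  # contains "taskId" (and "expires" for the index entry)
```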
Assignee
Comment 1•1 year ago
Skip running the indexing tasks if they already ran on the same revision
and the previous decision task expires in over a week.
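A minimal sketch of that check against the queue API, assuming the previous decision task's taskId is already known; the one-week threshold comes from the sentence above, and the helper name is illustrative.

```python
from datetime import datetime, timedelta, timezone

import requests

QUEUE_API = "https://firefox-ci-tc.services.mozilla.com/api/queue/v1"


def previous_run_still_fresh(previous_task_id, margin=timedelta(weeks=1)):
    """Return True if the previous decision task expires more than `margin` from now."""
    task = requests.get(f"{QUEUE_API}/task/{previous_task_id}").json()
    expires = datetime.fromisoformat(task["expires"].replace("Z", "+00:00"))
    return expires - datetime.now(timezone.utc) > margin
```

If this returns True and the tip revision has not changed since the previous run, the cron task would skip scheduling the indexing jobs.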
Updated•1 year ago
Comment 3•1 year ago
bugherder
Reporter
Comment 4•8 months ago
My mental model of the expiry logic in the PR here was that we would re-run searchfox tasks if it looks like the artifacts have less than a week on them. But in bug 1930345 we saw that this isn't the case; here's the Nov 11 cron task deciding not to re-trigger the searchfox jobs for which the artifacts had expired.
Looking more closely, the logic appears to be finding some earlier decision-searchfox-index task (apparently the first cron decision task ever run for that revision), which itself still has an expiry 9 months from now, while the decision task from the other day that decided not to index things still has nearly a full year left... I'm not sure which route wins there, but none of them will regenerate the searchfox artifacts (which appear to expire after 3 months).
Should I file a new bug for this / is there a correct idiom we can copy for this purpose? Thanks!
Assignee
Comment 5•8 months ago
Hmm, there are two things here:
- with the change here we check the expiration date of the previous searchfox-index cron task; that can't work, since that task is pretty much always going to be from the previous day. We should instead look up the last jobs that were actually scheduled (see the sketch below).
- indexing tasks like https://firefox-ci-tc.services.mozilla.com/tasks/Znd1gn9cQteWOXqQETsO-A (and their artifacts) expire after a year, like the cron task. But the searchfox-index cron task also schedules a couple of source-test tasks, which do seem to have shorter expiries; are these the ones that caused the issue for cypress?
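A rough sketch of what looking up the jobs that were actually scheduled could look like, assuming (as is the convention for gecko decision tasks) that the decision task's taskId doubles as the taskGroupId of the tasks it created; the helper name and the one-week margin are illustrative.

```python
from datetime import datetime, timedelta, timezone

import requests

QUEUE_API = "https://firefox-ci-tc.services.mozilla.com/api/queue/v1"


def tasks_about_to_expire(decision_task_id, margin=timedelta(weeks=1)):
    """Yield (taskId, expires) for scheduled tasks whose own expiry is within `margin`.

    Walks every task in the decision task's task group, so tasks with a
    shorter expiration policy (e.g. source-test) are caught too.
    """
    now = datetime.now(timezone.utc)
    params = {}
    while True:
        data = requests.get(
            f"{QUEUE_API}/task-group/{decision_task_id}/list", params=params
        ).json()
        for entry in data["tasks"]:
            expires = datetime.fromisoformat(
                entry["task"]["expires"].replace("Z", "+00:00")
            )
            if expires - now < margin:
                yield entry["status"]["taskId"], expires
        if "continuationToken" not in data:
            break
        params["continuationToken"] = data["continuationToken"]
```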
Assignee
Updated•8 months ago
Reporter
Comment 6•8 months ago
(In reply to Julien Cristau [:jcristau] from comment #5)
- indexing tasks like https://firefox-ci-tc.services.mozilla.com/tasks/Znd1gn9cQteWOXqQETsO-A (and their artifacts) expire after a year, like the cron task. But the searchfox-index cron task also schedules a couple of source-test tasks, that do seem to have shorter expiries; are these the ones that caused the issue for cypress?
Ah, yeah, the first specific error reported was for bugzilla-components.json and I erroneously assumed everything was expiring, but in fact only a subset of the fetches failed:
curl: (22) The requested URL returned error: 404
parallel: This job failed:
curl -SsfL --compressed https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/gecko.v2.cypress.revision.3f80fbb5cea5ff9c54b727195e9df32731321411.source.source-bugzilla-info/artifacts/public/components-normalized.json -o bugzilla-components.json || curl -SsfL --compressed https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/gecko.v2.cypress.latest.source.source-bugzilla-info/artifacts/public/components-normalized.json -o bugzilla-components.json
We saw 11 errors reported but only the one specific URL got logged; the fetches we issue are accumulated in fetch-tc-artifacts.sh. I'm attaching an example downloads.lst file that we build there, from a successful run yesterday:
+ parallel --halt now,fail=1
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
parallel: This job failed:
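For reference, the fallback pattern from the failing curl command above (revision-specific index path first, then the "latest" index path), restated as a small Python sketch. The real fetch list is built in fetch-tc-artifacts.sh; the function name here is illustrative.

```python
import requests

INDEX_API = "https://firefox-ci-tc.services.mozilla.com/api/index/v1"


def fetch_indexed_artifact(project, revision, task_name, artifact, dest):
    """Download an indexed artifact, falling back from the pinned revision to latest."""
    namespaces = [
        f"gecko.v2.{project}.revision.{revision}.{task_name}",
        f"gecko.v2.{project}.latest.{task_name}",
    ]
    for ns in namespaces:
        resp = requests.get(f"{INDEX_API}/task/{ns}/artifacts/{artifact}")
        if resp.ok:
            with open(dest, "wb") as f:
                f.write(resp.content)
            return True
    return False  # both lookups 404ed, like the bugzilla-components.json case above
```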
Reporter
Updated•8 months ago
Assignee
Comment 7•6 months ago
On repos that rarely get pushes, we don't necessarily want to re-run indexing every day. However, we need to make sure artifacts from downstream tasks remain available for searchfox to download.
The previous approach had two issues:
- the cron task looked at the standard taskgraph index path to find its previous run; that would normally have been the previous day's task, so the expiry logic would never kick in
- some of the downstream tasks (the searchfox kind) have the medium expiration policy, while others (source-test) use the default, so assuming that everything relevant would expire at the same time as the cron task itself was broken
With this change, the searchfox cron task:
- gets indexed at gecko.v2.{project}.revision.{revision}.searchfox-index only if it does schedule jobs
- looks up the previously indexed task at that location, and checks whether any of the tasks it scheduled are about to expire (see the sketch below)
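Putting those two pieces together, a condensed sketch of the decision the new cron task makes. The index namespace comes from the list above; the one-week margin and helper names are illustrative, and the real implementation lives in taskgraph rather than in standalone code like this.

```python
from datetime import datetime, timedelta, timezone

import requests

API_ROOT = "https://firefox-ci-tc.services.mozilla.com/api"
MARGIN = timedelta(weeks=1)  # illustrative freshness threshold


def _expires(task):
    return datetime.fromisoformat(task["expires"].replace("Z", "+00:00"))


def should_schedule_indexing(project, revision):
    """True if this revision was never indexed, or any task from the last run is near expiry."""
    namespace = f"gecko.v2.{project}.revision.{revision}.searchfox-index"
    resp = requests.get(f"{API_ROOT}/index/v1/task/{namespace}")
    if resp.status_code == 404:
        return True  # never indexed: schedule the jobs (and index this run)
    previous = resp.json()["taskId"]
    # Walk everything the previous run scheduled, not just the cron task itself,
    # since searchfox and source-test tasks have different expiration policies.
    now = datetime.now(timezone.utc)
    params = {}
    while True:
        data = requests.get(
            f"{API_ROOT}/queue/v1/task-group/{previous}/list", params=params
        ).json()
        if any(_expires(entry["task"]) - now < MARGIN for entry in data["tasks"]):
            return True
        if "continuationToken" not in data:
            return False
        params["continuationToken"] = data["continuationToken"]
```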