Closed Bug 1884364 Opened 2 months ago Closed 2 months ago

reduce json-automationrelevance queries to hg.mozilla.org

Tracking

(firefox126 fixed)

Status:

RESOLVED FIXED

Milestone:

126 Branch

Tracking Flags:

Tracking

Status

firefox126

---

fixed

People

(Reporter: bhearsum, Assigned: ahal)

References

(Regressed 1 open bug)

Details

Attachments

(4 files, 1 obsolete file)

Bug 1884364 - Convert gecko_taskgraph/test/test_decision.py to pytest format, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1884364 - Create a new 'files_changed' parameter, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1884364 - Add an option to \|mach taskgraph\| to force using locally changed files, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1884364 - Re-enable tgdiff task and force local files changed, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1884364 - Use local files_changed in Taskcluster integration tests, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review

bhearsum@mozilla.com (:bhearsum)

Reporter

Description

•

2 months ago

hg.mozilla.org is under great stress lately; we should stop making unnecessary queries to it. This endpoint in particular seems to be queried unconditionally every time we run ./mach taskgraph tasks or any other downstream commands. This means that we are often querying it in places where it is absolutely not necessary. As a specific example, I noticed it being queried when we run release decision tasks which definitely do not need the information it provides.

We make the call in this function, which is memoized, so we will only query it once per process (unless we get an error that we retry for).

That function is called in two places:

get_changed_files, which is used by two optimization strategies, a function only used by tests, and a function that is called by a transform to determine the number of chunks of something to run. Of these, the latter is the only concerning one. Because it is called as part of a transform, it will necessarily be run in virtually any scenario that taskgraph runs in.
set_base_revision_in_tgdiff, which is a transform used by the taskgraph-diff tasks in the source-test kind. This is another place where we hit this endpoint virtually any time that taskgraph runs.

I suggest that we find a way to stop hitting this endpoint as part of transforms. This will avoid it being called in things like release decision tasks and other action tasks, and likely other scenarios as well (such as running taskgraph locally, and probably some other scenarios in automation I'm not considering).

@ahal - I'm needinfo'ing you because you generally have clever ideas about this sort of thing, but also because you're the defacto expert on the taskgraph-diff tasks.

@jmaher - You're the defacto expert on test configuration. Are there ways we can avoid running this transform, or defering it to a later point (like an optimization or morph)? Even if we temporarily made the chunks fixed to reduce hg load, that might be a good thing.

bhearsum@mozilla.com (:bhearsum)

Reporter

Comment 1

•

2 months ago

get_changed_files, which is used by two optimization strategies, a function only used by tests, and a function that is called by a transform to determine the number of chunks of something to run. Of these, the latter is the only concerning one. Because it is called as part of a transform, it will necessarily be run in virtually any scenario that taskgraph runs in.

Correct here: this actually only calls to hg.m.o for non-Try repos. I still think it would be useful to further limit or defer this - but it's not as critical because of this.

bhearsum@mozilla.com (:bhearsum)

Reporter

Comment 2

•

2 months ago

(In reply to bhearsum@mozilla.com (:bhearsum) from comment #0)

set_base_revision_in_tgdiff, which is a transform used by the taskgraph-diff tasks in the source-test kind. This is another place where we hit this endpoint virtually any time that taskgraph runs.

@ahal - I'm needinfo'ing you because you generally have clever ideas about this sort of thing, but also because you're the defacto expert on the taskgraph-diff tasks.

I think I found a good solution to this part. I'm moving it out to https://bugzilla.mozilla.org/show_bug.cgi?id=1884386.

Depends on: 1884386

Andrew Halberstadt [:ahal]

Assignee

Comment 3

•

2 months ago

What if we turned files-changed into a parameter rather than something that needs to be computed during graph generation?

Decision tasks would still need to use json-automationrelevance in decision.py, but we can avoid it in all other cases:

When ./mach taskgraph locally we could rely on local vcs to populate this. E.g, use all draft commits that are ancestors of the current commit.
When running locally but with --parameters task-id=<id> (or similar), you'd get the exact files-changed that the decision task computed as it's just baked into the parameters.
We would hardcode dummy files changed into all the taskcluster/test/params, so the taskgraph-diff task wouldn't compute anything either. Or if we still wanted to use real files changed (which might be necessary for testing optimizations and such), we could have ./mach taskgraph insert this parameter on the fly (either using local VCS or the value in the Decision task's parameters, but never using json-automationrelevance).
Transforms would just use what's in the params and also wouldn't need to compute anything.

After writing this out, this seems like a no-brainer to me! Can you think of anything I might be missing?

bhearsum@mozilla.com (:bhearsum)

Reporter

Comment 4

•

2 months ago

(In reply to bhearsum@mozilla.com (:bhearsum) from comment #0)

That function is called in two places:

get_changed_files, which is used by two optimization strategies, a function only used by tests, and a function that is called by a transform to determine the number of chunks of something to run. Of these, the latter is the only concerning one. Because it is called as part of a transform, it will necessarily be run in virtually any scenario that taskgraph runs in.

@jmaher - You're the defacto expert on test configuration. Are there ways we can avoid running this transform, or defering it to a later point (like an optimization or morph)? Even if we temporarily made the chunks fixed to reduce hg load, that might be a good thing.

Whoops, forgot to needinfo, heh.

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 5

•

2 months ago

the two spots:

files-changed.py :: check - this appears to only be used by unittests; I suspect we could live without this, or keep it as this is only a unittest; we should confirm that
perfile.py :: per_file_number_of_chunks - this is specifically for test-verify/test-coverage. The point of this is to find test files that were changed so we can run test-verify/coverage on those. This is why these tasks are only run periodically, when certain files are changed. If we didn't do this here, we wouldn't know the number of chunks (no files == zero chunks) and the tasks wouldn't be scheduled.

I am not sure about moving the logic from perfile.py to somewhere else. Could we live without test-coverage/test-verify- sure we could. It appears for perfile.py stuff this only invokes the files_changed when not on try; maybe we could limit this to run on try and central and skipping autoland?

Other suggestions are welcome.

Flags: needinfo?(jmaher)

Andrew Halberstadt [:ahal]

Assignee

Comment 6

•

2 months ago

Attached file Bug 1884364 - Convert gecko_taskgraph/test/test_decision.py to pytest format, r?#taskgraph-reviewers! — Details

I was having problems getting 'responses' to work with the unittest style tests
and decided it would be easier to just convert it to Pytest.

Andrew Halberstadt [:ahal]

Assignee

Comment 7

•

2 months ago

Attached file Bug 1884364 - Create a new 'files_changed' parameter, r?#taskgraph-reviewers! — Details

We use hg.m.o's json-automationrelevance endpoint for a variety of reasons
such as getting the files changed for optimization purposes, or finding the
base revision for diff purposes. But this endpoint is slow and puts undue load
on hg.mozilla.org if queried too often.

The helper function that fetches this is memoized, so in theory we should only
ever make this request once per graph generation. However, there are still cases
where we request this unnecessarily:

When running ./mach taskgraph locally, we first fetch
json-automationrelevance and then fall back to fetching it locally if the
revision wasn't found. I believe the reason for this is to be able to generate
identical graphs as produced by CI.
When specifying multiple parameters (so graphs are generated in parallel),
the memoize won't cache across processes, so we make the request once per
parameter set.
Any other time we generate tasks outside the context of a Decision task (e.g
./mach try), as there are transforms that call this function.

By turning files_changed into a parameter, we can ensure that this value gets
"frozen" by the Decision task and it will never need to be recomputed. E.g, you
could use -p task-id=<decision id> and you'd still get the files_changed
value that Decision task computed. This means, that for all non-Decision use
cases we can rely on local VCS to give us our changed files.

This should greatly cut back on the number of queries being made to hg.m.o.

Phabricator Automation

Updated

•

2 months ago

Assignee: nobody → ahal

Attachment #9390304 - Attachment description: WIP: Bug 1884364 - Convert gecko_taskgraph/test/test_decision.py to pytest format, r?#taskgraph-reviewers! → Bug 1884364 - Convert gecko_taskgraph/test/test_decision.py to pytest format, r?#taskgraph-reviewers!

Status: NEW → ASSIGNED

Phabricator Automation

Updated

•

2 months ago

Attachment #9390305 - Attachment description: WIP: Bug 1884364 - Create a new 'files_changed' parameter, r?#taskgraph-reviewers! → Bug 1884364 - Create a new 'files_changed' parameter, r?#taskgraph-reviewers!

Andrew Halberstadt [:ahal]

Assignee

Comment 8

•

2 months ago

Attached file Bug 1884364 - Add an option to |mach taskgraph| to force using locally changed files, r?#taskgraph-reviewers! — Details

Andrew Halberstadt [:ahal]

Assignee

Comment 9

•

2 months ago

Attached file Bug 1884364 - Re-enable tgdiff task and force local files changed, r?#taskgraph-reviewers! — Details

Previously we were using json-automationrelevance to compute the base_rev.
But recently we improved the Decision task to be smarter about determining
this, so the value already baked into the parameters should be sufficient.

Note that bug 1884386 already removed the transform that was calling
get_json_automationrelevance. Rather than re-adding it, this patch uses the
task_context transforms to grab the value from parameters instead.

This additionally adds the --force-local-files-changed flag to the command,
as we might miss something if we just used the fake values baked into the test
params.

Julien Cristau [:jcristau]

Updated

•

2 months ago

Comment 10

•

2 months ago

Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0c89ad084b0d
Convert gecko_taskgraph/test/test_decision.py to pytest format, r=taskgraph-reviewers,gbrown
https://hg.mozilla.org/integration/autoland/rev/85241d2d265c
Create a new 'files_changed' parameter, r=taskgraph-reviewers,releng-reviewers,jcristau
https://hg.mozilla.org/integration/autoland/rev/d6232354e9f3
Add an option to |mach taskgraph| to force using locally changed files, r=taskgraph-reviewers,jcristau
https://hg.mozilla.org/integration/autoland/rev/60a6ed6de4ae
Re-enable tgdiff task and force local files changed, r=taskgraph-reviewers,jcristau

Sandor Molnar[:smolnar]

Comment 11

•

2 months ago

Backed out for causing py3 unit test failures

Backout link: https://hg.mozilla.org/integration/autoland/rev/719fcbd9cc2a0730856fd730ef31f17f6c00a005

Push with failures

Failure log

taskcluster/test/test_mach_try_auto.py::test_only_important_manifests PASSED
[task 2024-03-14T16:28:52.831Z] taskcluster/test/test_mach_try_auto.py::test_tasks_are_scheduled[mochitest-browser-chrome] TEST-UNEXPECTED-FAIL
[task 2024-03-14T16:28:52.831Z] taskcluster/test/test_mach_try_auto.py::test_tasks_are_not_scheduled[no shippable builds] PASSED
[task 2024-03-14T16:28:52.831Z] taskcluster/test/test_mach_try_auto.py::test_tasks_are_not_scheduled[no fuzzing builds] PASSED
[task 2024-03-14T16:28:52.831Z] taskcluster/test/test_mach_try_auto.py::test_tasks_are_not_scheduled[no ccov builds] PASSED
[task 2024-03-14T16:28:52.831Z] taskcluster/test/test_mach_try_auto.py::test_tasks_are_not_scheduled[no build-signing] TEST-UNEXPECTED-FAIL
[task 2024-03-14T16:28:52.831Z] taskcluster/test/test_mach_try_auto.py::test_tasks_are_not_scheduled[no upload-symbols] PASSED

Flags: needinfo?(ahal)

Andrew Halberstadt [:ahal]

Assignee

Comment 12

•

2 months ago

Attached file Bug 1884364 - Use local files_changed in Taskcluster integration tests, r?#taskgraph-reviewers! (obsolete) — Details

Rob Lemley [:rjl]

Updated

•

2 months ago

Comment 13

•

2 months ago

Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d3a11f386f93
Convert gecko_taskgraph/test/test_decision.py to pytest format, r=taskgraph-reviewers,gbrown
https://hg.mozilla.org/integration/autoland/rev/6bae790de5de
Create a new 'files_changed' parameter, r=taskgraph-reviewers,releng-reviewers,jcristau
https://hg.mozilla.org/integration/autoland/rev/0c6f696169f0
Add an option to |mach taskgraph| to force using locally changed files, r=taskgraph-reviewers,jcristau
https://hg.mozilla.org/integration/autoland/rev/5ed9438f3dc0
Re-enable tgdiff task and force local files changed, r=taskgraph-reviewers,jcristau
https://hg.mozilla.org/integration/autoland/rev/dc701f4ef6c4
Use local files_changed in Taskcluster integration tests, r=taskgraph-reviewers,bhearsum

Treeherder Bug Filer

Updated

•

2 months ago

Regressions: 1886230

Narcis Beleuzu [:NarcisB]

Comment 14

•

2 months ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/d3a11f386f93
https://hg.mozilla.org/mozilla-central/rev/6bae790de5de
https://hg.mozilla.org/mozilla-central/rev/0c6f696169f0
https://hg.mozilla.org/mozilla-central/rev/5ed9438f3dc0
https://hg.mozilla.org/mozilla-central/rev/dc701f4ef6c4

Status: ASSIGNED → RESOLVED

Closed: 2 months ago

status-firefox126: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 126 Branch

Connor Sheehan [:sheehan]

Comment 15

•

2 months ago

Just wanted to come in here and say thanks for taking this on to help with the load issues on hg.mozilla.org, it is much appreciated.

Phabricator Automation

Updated

•

2 months ago

Attachment #9391776 - Attachment is obsolete: true

Andrew Halberstadt [:ahal]

Assignee

Comment 16

•

2 months ago

Np! Has it made a noticeable difference?

Andrew Halberstadt [:ahal]

Assignee

Updated

•

2 months ago

Flags: needinfo?(ahal)

Andrew Halberstadt [:ahal]

Assignee

Updated

•

1 month ago

Updated

•

2 days ago

Regressions: 1896978

You need to log in before you can comment on or make changes to this bug.

Bug 1884364 - Convert gecko_taskgraph/test/test_decision.py to pytest format, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1884364 - Create a new 'files_changed' parameter, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1884364 - Add an option to \|mach taskgraph\| to force using locally changed files, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1884364 - Re-enable tgdiff task and force local files changed, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1884364 - Use local files_changed in Taskcluster integration tests, r?#taskgraph-reviewers! 2 months ago Andrew Halberstadt [:ahal] 48 bytes, text/x-phabricator-request		Details \| Review