Closed Bug 1891768 Opened 8 months ago Closed 7 months ago

The taskgraph should use a minimal hgweb endpoint to retrieve the list of files modified in a push

Categories

(Release Engineering :: General, enhancement)

Tracking

(firefox127 fixed)

RESOLVED FIXED

People

(Reporter: glob, Assigned: ahal)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 1 obsolete file)

gecko_taskgraph/files_changed.py::get_changed_files() calls the json-automationrelevance endpoint on hgweb to fetch metadata about the push.

Ignoring the debugging output, the only information that appears to be consumed is a set of files modified by the push:
https://searchfox.org/mozilla-central/source/taskcluster/gecko_taskgraph/files_changed.py#25
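
For illustration, here is a minimal sketch of the consumption pattern described above: only the "files" list of each commit is read and everything else is discarded. The top-level "changesets" key and the function name files_from_relevance are assumptions made for this sketch; they are not taken from files_changed.py.

```python
def files_from_relevance(payload):
    """Collect the set of files touched by every commit in a push.

    Assumption: the response wraps the per-commit records in a
    top-level "changesets" list (hypothetical shape for this sketch).
    """
    changed = set()
    for changeset in payload.get("changesets", []):
        # Only "files" is consumed; author, desc, reviewers, etc. are ignored.
        changed.update(changeset.get("files", []))
    return changed


if __name__ == "__main__":
    # Abbreviated record using the file names from the example in this bug.
    sample = {
        "changesets": [
            {
                "node": "469d4a744966c133ae480dc66643d47b77aa3209",
                "files": [
                    "accessible/generic/LocalAccessible.cpp",
                    "accessible/tests/mochitest/focus/test_takeFocus.html",
                ],
            }
        ]
    }
    for path in sorted(files_from_relevance(sample)):
        print(path)
```

Using a set means a file touched by several commits in the same push is reported once.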

The automationrelevance web command, implemented at https://github.com/mozilla/version-control-tools/blob/6afea0e78226f11b1d55c4805682d16c5dbe051d/hgext/hgmo/__init__.py#L345, returns a significant amount of metadata. Here's an example of the data returned for each commit in a push:

{
  "author": "James Teh <jteh@mozilla.com>",
  "backsoutnodes": [],
  "bugs": [
    {
      "no": "1305428",
      "url": "https://bugzilla.mozilla.org/show_bug.cgi?id=1305428"
    }
  ],
  "date": [
    1710724305,
    0
  ],
  "desc": "Bug 1305428: Don't set ARIA DOM attributes due to accessibility API calls. r=eeejay\n\nPer the spec, with respect to ARIA, \"accessibility APIs operate in one direction only. User agents publish WAI-ARIA information (roles, states, and properties) via an accessibility API, and an AT can acquire that information using the same API. However, the other direction is not supported.\"\nAlthough Firefox has not complied with this part of the spec for many years, this can cause problems for some ARIA widgets which aren't expecting ARIA attributes to be changed by the browser (nor should they, per the spec).\nThis might change one day, but for now, we should align with the spec.\n\nDifferential Revision: https://phabricator.services.mozilla.com/D204559",
  "extra": {
    "branch": "default",
    "moz-landing-system": "lando"
  },
  "files": [
    "accessible/generic/LocalAccessible.cpp",
    "accessible/tests/mochitest/focus/test_takeFocus.html"
  ],
  "landingsystem": "lando",
  "node": "469d4a744966c133ae480dc66643d47b77aa3209",
  "parents": [
    "488c57bb8ceed6da3f0be935156a480cdb92bc32"
  ],
  "perfherderurl": "https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-beta&originalRevision=54ad040f8a47539f1e5153aab3ed873341696280&newProject=mozilla-beta&newRevision=469d4a744966c133ae480dc66643d47b77aa3209",
  "phase": "public",
  "pushdate": [
    1713167971,
    0
  ],
  "pushhead": "54ad040f8a47539f1e5153aab3ed873341696280",
  "pushid": 19189,
  "pushuser": "ffxbld-merge",
  "rev": 775127,
  "reviewers": [
    {
      "name": "eeejay",
      "revset": "reviewer(eeejay)"
    }
  ],
  "treeherderrepo": "mozilla-beta",
  "treeherderrepourl": "https://treeherder.mozilla.org/jobs?repo=mozilla-beta"
}

There have been some massive pushes recently, such as 54ad040f8a47539f1e5153aab3ed873341696280, which contains 35,324 commits. The resulting automationrelevance response is expensive to generate server-side, expensive to consume and parse client-side, and likely to cause timeouts because generation takes so long.

Specifically, the automationrelevance response for 54ad040f8a47539f1e5153aab3ed873341696280, which I won't link to here for obvious reasons, is over 80MB of minified JSON and takes more than 90 seconds to generate.

It would be better to either identify or author an hg web command that returns only the data taskgraph needs, which in this case is just the set of files. For the push above, this will reduce the size of the minified JSON to 10MB, and is very likely to be quicker to generate server-side.
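
As a rough sketch of why a files-only response is smaller, the snippet below compares the minified JSON size of full per-commit records against a reduced payload. The {"files": [...]} output shape, the helper names, and the sample records are all hypothetical; the records are made-up and much shorter than real ones, so the printed numbers say nothing about the actual endpoint.

```python
import json


def minified_size(obj):
    """Return the size in bytes of obj serialized as minified JSON."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def files_only(changesets):
    """Reduce per-commit records to one de-duplicated, sorted file list.

    The {"files": [...]} shape is an assumption for this sketch.
    """
    files = set()
    for changeset in changesets:
        files.update(changeset.get("files", []))
    return {"files": sorted(files)}


if __name__ == "__main__":
    # Made-up records loosely modelled on the example earlier in this bug.
    changesets = [
        {
            "author": "Example Author <author@example.com>",
            "desc": "Bug 0: an example commit message. r=someone",
            "files": ["dir/one.cpp", "dir/two.html"],
            "node": "0" * 40,
            "pushid": 1,
        },
        {
            "author": "Example Author <author@example.com>",
            "desc": "Bug 0: a follow-up touching one of the same files. r=someone",
            "files": ["dir/one.cpp"],
            "node": "1" * 40,
            "pushid": 1,
        },
    ]
    full = minified_size({"changesets": changesets})
    slim = minified_size(files_only(changesets))
    print(f"full records: {full} bytes, files only: {slim} bytes")
```

The saving comes from two places: dropping the per-commit metadata, and listing each file once per push rather than once per commit.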

Hey glob, I totally agree that a smaller API that only returns files changed would be a big win!

IIRC we looked into better APIs in the past, but there's nothing available. Which makes sense because core Mercurial has no concept of a "push". It's possible there's an API bundled with the pushloghtml extension we're missing, though in my experience requests to that extension aren't exactly super performant either.

If you know of a better endpoint we can use, we'd be happy to switch. If we need to first implement said endpoint, this bug likely belongs in Core Services :: Mercurial: hg.mozilla.org for now. Or perhaps we can file a new one there that blocks this.

Flags: needinfo?(glob)

Thanks ahal; I was mostly looking for confirmation that my reading of taskgraph's requirements is accurate. I'll file a new bug in the hg component that blocks this one to track adding the endpoint, leaving this bug to track updating taskgraph to use said endpoint.

FWIW the implementation of this will be trivial; mostly duplication of automationrelevancewebcommand https://github.com/mozilla/version-control-tools/blob/6afea0e78226f11b1d55c4805682d16c5dbe051d/hgext/hgmo/__init__.py#L345 with a whole lot of lines removed. I strongly suspect it'll take longer to write and/or run the tests.

Flags: needinfo?(glob)
Depends on: 1892039
See Also: → 1884364
Assignee: nobody → sheehan

@sheehan did you mean to take bug 1892039 instead? :)

Comment on attachment 9398427 [details]
hgmo: add a pushchangedfiles webcommand (Bug 1891768) r?glob,ahal

Revision D208521 was moved to bug 1892039. Setting attachment 9398427 [details] to obsolete.

Attachment #9398427 - Attachment is obsolete: true

(In reply to Julien Cristau [:jcristau] from comment #3)

@sheehan did you mean to take bug 1892039 instead? :)

Yes, I did. :)

Assignee: sheehan → nobody
Assignee: nobody → ahal
Status: NEW → ASSIGNED
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7ed470dd2a13
Upgrade taskcluster-taskgraph vendor to 8.0.1, r=taskgraph-reviewers,mach-reviewers,bhearsum
https://hg.mozilla.org/integration/autoland/rev/e5527e94df8b
[ci] Swap out 'json-automationrelevance' for new 'json-pushchangedfiles' endpoint, r=sheehan,taskgraph-reviewers,bhearsum

Pushed by smolnar@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b999fc36dea4
Update 'gecko_taskgraph/test/test_actions_util' for functools.lru_cache memoization, r?#taskgraph-reviewers! CLOSED TREE
See Also: → 1895807