Closed Bug 1883046 Opened 3 months ago Closed 3 months ago

make nightly decision task resilient to balrog releases without completes

Categories

(Release Engineering :: Release Automation: Other, defect)

Tracking

(firefox125 fixed)

RESOLVED FIXED
Tracking Status
firefox125 --- fixed

People

(Reporter: bhearsum, Assigned: bhearsum)

Details

Attachments

(1 file)

As part of https://bugzilla.mozilla.org/show_bug.cgi?id=1882729 we created some nightly release blobs in Balrog that didn't have completes in them. They ended up causing an error when the nightly decision task tried to fetch completes out of them:

[task 2024-02-29T22:00:46.290Z] Traceback (most recent call last):
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/taskcluster/mach_commands.py", line 209, in taskgraph_decision
[task 2024-02-29T22:00:46.290Z]     ret = taskgraph_commands["decision"].func(options)
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/taskcluster/gecko_taskgraph/main.py", line 683, in decision
[task 2024-02-29T22:00:46.290Z]     taskgraph_decision(options)
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/taskcluster/gecko_taskgraph/decision.py", line 186, in taskgraph_decision
[task 2024-02-29T22:00:46.290Z]     set_decision_indexes(decision_task_id, tgg.parameters, tgg.graph_config)
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/third_party/python/taskcluster_taskgraph/taskgraph/generator.py", line 151, in parameters
[task 2024-02-29T22:00:46.290Z]     return self._run_until("parameters")
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/third_party/python/taskcluster_taskgraph/taskgraph/generator.py", line 425, in _run_until
[task 2024-02-29T22:00:46.290Z]     k, v = next(self._run)
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/third_party/python/taskcluster_taskgraph/taskgraph/generator.py", line 264, in _run
[task 2024-02-29T22:00:46.290Z]     parameters = self._parameters(graph_config)
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/taskcluster/gecko_taskgraph/decision.py", line 171, in <lambda>
[task 2024-02-29T22:00:46.290Z]     lambda graph_config: get_decision_parameters(graph_config, options)
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/taskcluster/gecko_taskgraph/decision.py", line 372, in get_decision_parameters
[task 2024-02-29T22:00:46.290Z]     parameters["release_history"] = populate_release_history("Firefox", project)
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/taskcluster/gecko_taskgraph/util/partials.py", line 194, in populate_release_history
[task 2024-02-29T22:00:46.290Z]     return _populate_nightly_history(
[task 2024-02-29T22:00:46.290Z]   File "/builds/worker/checkouts/gecko/taskcluster/gecko_taskgraph/util/partials.py", line 255, in _populate_nightly_history
[task 2024-02-29T22:00:46.290Z]     url = history["platforms"][platform]["locales"][locale]["completes"][0][
[task 2024-02-29T22:00:46.290Z] KeyError: 'completes'

We should be more resilient here and simply skip any release blob that has no completes.

This fixes the specific issue we just hit, a release containing only partials, and also guards against a potential future problem if buildID is missing. (Less likely...but still possible?)
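
For illustration only, the skipping behaviour described above could look roughly like the sketch below. It is not the landed patch: the helper name `usable_entries` is hypothetical, and the assumption that `buildID` sits next to `completes` under each locale is mine; only the `history["platforms"][platform]["locales"][locale]["completes"][0]` lookup comes from the traceback.

```python
def usable_entries(releases, platform, locale):
    """Yield (build_id, first_complete) for releases that carry both fields.

    Hypothetical helper: `releases` is an iterable of Balrog-style release
    blobs (plain dicts). The position of "buildID" is an assumption.
    """
    for history in releases:
        try:
            locale_data = history["platforms"][platform]["locales"][locale]
            build_id = locale_data["buildID"]
            first_complete = locale_data["completes"][0]
        except (KeyError, IndexError):
            # Missing platform, locale, buildID, or completes (or an empty
            # completes list): skip this release instead of failing.
            continue
        yield build_id, first_complete
```

Catching `IndexError` alongside `KeyError` also covers a blob whose `completes` list is present but empty, which would otherwise fail on the `[0]` lookup.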

Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a03780370568
make nightly decision tasks more resilient to unexpected Balrog release data r=releng-reviewers,taskgraph-reviewers,jcristau
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED