Closed Bug 1883013 Opened 1 year ago Closed 1 year ago

mach try fuzzy push fails because task has expired

Categories

(Developer Infrastructure :: Try, defect)

Tracking

(firefox125 fixed)

RESOLVED FIXED
Tracking Status
firefox125 --- fixed

People

(Reporter: padenot, Assigned: ahal, Mentored)

Attachments

(1 file)

STR:

  • Run mach try fuzzy --preset media-full

Expected:

  • My tests run

Actual:

According to :jcristau, this is because I did this push on the first of March and the nofis variant has expired. But looking at my shell output, the task list cache wasn't invalidated.

There are a number of classic solutions for this classic problem of cache invalidation:

  • Push the query to the Decision task, which then resolves it. When using a preset, this also saves the developer several seconds, since computing the task list is itself very inefficient
  • Properly invalidate the cache by checking a valid source of truth instead of the local state (which might be outdated)
  • Don't error out when that's the case, and emit a warning instead (orange vs. red)

To fix the invalidation logic, we would read all expiration keys in:
https://searchfox.org/mozilla-central/source/taskcluster/ci/test/variants.yml

Then, if the current date is later than any of them, invalidate the cache. The existing invalidation is done here:
https://searchfox.org/mozilla-central/rev/6bc0f370cc459bf79e1330ef74b21009a9848c91/tools/tryselect/tasks.py#37
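
The expiration check described above could look roughly like the sketch below. It is not the tryselect implementation: the function names are hypothetical, the variants are passed in as an already-parsed dict rather than read from variants.yml, and it assumes each `expiration` value is either the string "never" or an ISO date (YYYY-MM-DD).

```python
from datetime import date


def expired_variants(variants, today):
    """Return the names of variants whose expiration date is before `today`.

    `variants` is assumed to map a variant name to a dict holding an
    "expiration" key: either "never" or an ISO date string (assumption).
    """
    expired = []
    for name, config in variants.items():
        expiration = config.get("expiration", "never")
        if expiration == "never":
            continue
        # "Current date is greater than the expiration" means expired.
        if date.fromisoformat(expiration) < today:
            expired.append(name)
    return sorted(expired)


def cache_is_stale(variants, today):
    """Treat the cached task list as stale if any variant has expired."""
    return bool(expired_variants(variants, today))


# Hypothetical data standing in for the parsed variants file.
variants = {
    "variant-a": {"expiration": "never"},
    "variant-b": {"expiration": "2024-03-01"},
}
```

With this data, `cache_is_stale(variants, date(2024, 3, 2))` reports a stale cache, while a check run on or before the expiration date does not.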

In principle the idea of having the Decision task resolve the preset is a good one. But in practice this would be crossing some separation boundaries: handling the preset is done by the tryselect module, which currently has nothing to do with the Decision task. I suppose we could have the Decision task call into tryselect, but this would create a bit of a dependency soup. I'm not sure it would be worth it for this use case.

Mentor: ahal

Besides, some presets are saved in a user's local state dir, so the Decision task wouldn't be able to access them anyway. We could check these in like we do the try_task_config.json file, but it would be even more complicated. I vote for simply fixing the invalidation.

I took a peek at option 3, and ignoring tasks that don't exist is easy and will solve other invalidation errors as well. So let's just do that. We can still fix the invalidation logic in addition if we want, but it's probably not worth it tbh, as the local cache would likely be invalidated next time you pull anyway.

Assignee: nobody → ahal
Status: NEW → ASSIGNED

Requested tasks missing from the full task graph typically shouldn't happen,
because both the try push and the Decision task generate the graph based off
the same revision. But one scenario where it's possible is if the local graph
was loaded from cache and there are invalidation bugs.

A known invalidation bug can happen when a variant expires between the time a
cached graph was saved and the time it was loaded.
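
As a rough illustration of the "warn rather than fail" approach, the sketch below drops requested tasks that are absent from the full task graph and logs a warning for them. The function name, logger name, and task labels are hypothetical; this is not the code from the landed patch.

```python
import logging

logger = logging.getLogger("tryselect_sketch")  # hypothetical logger name


def filter_existing_tasks(requested, full_task_graph_labels):
    """Keep only requested task labels that exist in the full task graph.

    Missing labels produce a warning instead of an error, so a stale
    local cache no longer makes the whole push fail.
    """
    known = set(full_task_graph_labels)
    missing = sorted(label for label in requested if label not in known)
    if missing:
        logger.warning(
            "Ignoring %d requested task(s) not in the full task graph: %s",
            len(missing),
            ", ".join(missing),
        )
    # Preserve the original request order for the tasks that remain.
    return [label for label in requested if label in known]
```

For example, `filter_existing_tasks(["a", "b", "c"], ["a", "c", "d"])` keeps `a` and `c` and warns about `b`.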

See Also: → 1882812
Severity: -- → S3
Pushed by ahalberstadt@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/492aaafda07a [try] Log warning rather than fail if there are requested tasks that don't exist in full task graph, r=taskgraph-reviewers,jcristau
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED