Closed Bug 1118068 Opened 11 years ago Closed 9 years ago

Stop making duplicate fetch-missing-push-logs requests

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P3)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: emorley, Unassigned)

References

Details

In bug 1085682 it was noticed (and I think we even mentioned it at the work week too) that the fetch-missing-push-logs tasks often end up repeating the fetch of the same repo/revision combination. This presumably happens as follows:

1) For whatever reason, a push isn't imported through the normal ingestion (500s, races, ...).
2) The scheduled ingestion of builds-{pending,running,4hr} runs, and a job is found that belongs to a push that is unknown to Treeherder.
3) A new fetch-missing-push-logs task is created to ingest that missing push, and the import of the job in question is skipped.
4) Attempts to ingest other jobs with that repo/revision combo are made before the async fetch-missing-push-logs task completes. These come from either:
   a) other jobs from the same push, during the ingestion task from step #2.
   b) subsequent runs of the builds-{pending,running,4hr} task, which find the exact same job from step #2.
5) This results in duplicate entries in the fetch-missing-push-logs queue.
6) When the fetch-missing-push-logs queue is processed, we don't double-check whether the push has since been ingested, and re-fetch json-pushes anyway.

As such, we either need to query for the repo/rev combo at step #6 and bail if the push is already present, or else avoid the dupes in the queue in the first place.
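To illustrate the two options, here is a minimal sketch in Python. It is not Treeherder's actual code: the `MissingPushQueue` class, its method names, and the in-memory sets standing in for the task queue and the database are all hypothetical, and `fetch_push` is a placeholder for whatever fetches json-pushes. It shows both guards together: skipping duplicates when enqueueing, and re-checking whether the push has been ingested before fetching.

```python
from collections import deque


class MissingPushQueue:
    """Hypothetical in-memory stand-in for the fetch-missing-push-logs queue."""

    def __init__(self, fetch_push):
        self._queue = deque()      # pending (repo, revision) fetches, in order
        self._pending = set()      # same keys, for O(1) duplicate checks
        self._ingested = set()     # stand-in for "push exists in the database"
        self._fetch_push = fetch_push  # placeholder for the json-pushes fetch

    def mark_ingested(self, repo, revision):
        """Record that a push arrived, e.g. through the normal ingestion."""
        self._ingested.add((repo, revision))

    def enqueue(self, repo, revision):
        """Option 2: avoid dupes in the queue in the first place."""
        key = (repo, revision)
        if key in self._ingested or key in self._pending:
            return False
        self._pending.add(key)
        self._queue.append(key)
        return True

    def process(self):
        """Option 1: re-check at processing time and bail if already ingested."""
        fetched = 0
        while self._queue:
            key = self._queue.popleft()
            self._pending.discard(key)
            if key in self._ingested:
                continue  # the push turned up in the meantime; skip the fetch
            self._fetch_push(*key)
            self._ingested.add(key)
            fetched += 1
        return fetched
```

In a real deployment the pending set would have to live somewhere shared between workers (the database or a cache), since the enqueueing and processing happen in different tasks; the sketch ignores that and the associated races.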
Priority: P2 → P3
Instead, we're now going to remove the fetch-missing-push-logs task entirely (bug 1191934).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX