Closed Bug 1075799 Opened 10 years ago Closed 10 years ago

Treeherder production job ingestion is getting backlogged

Categories

(Tree Management :: Treeherder, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jfkthame, Unassigned)

References

Details

I pushed a try job about an hour and a half ago, and just loaded both Treeherder and TBPL to see how it's going. According to Treeherder, 4 jobs (out of a total of 65) have completed at this time. However, TBPL's view of the same try push shows over 30 completed jobs; Treeherder still lists many of these as running but "overdue". TBPL also shows lots of running (gray) jobs that have not appeared in Treeherder at all. At this point, then, TBPL is giving me a much more up-to-date view of how my jobs are progressing.
On top of this, TBPL is unable to star any failures; attempts fail with "Undefined" Bugzilla errors. All trees are closed because we're effectively unable to sheriff them right now.
Severity: normal → blocker
Things were fixed for a while, but now I'm seeing missing pushes on Aurora yet again. Not sure what the status is with other trees, but I wouldn't be surprised to hear that they're in the same boat.
(This bug is about job result ingestion; see bug 1076750 for issues with pushlog ingestion.) Part of the problem here was due to the various server restarts for the security updates. However, we can still do better. I'm adding some dep bugs for things that should fix this issue, but we should probably also check whether other optimisations can be made, e.g. in the processing time required for each job result and in how frequently we poll builds-4hr/builds-running/builds-pending.
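To illustrate why per-job processing time and polling frequency both matter, here is a rough sketch of how a backlog builds up when more job results arrive per poll than can be processed before the next poll. The function name, parameters, and all numbers are hypothetical and chosen for illustration only; they are not measurements from Treeherder.

```python
def backlog_after(polls, jobs_per_poll, seconds_per_job, poll_interval, workers=1):
    """Return the number of unprocessed job results left after each poll cycle.

    Hypothetical model: every poll delivers `jobs_per_poll` new results, and
    `workers` processes each need `seconds_per_job` seconds per result.
    """
    # How many results can be processed before the next poll arrives.
    capacity = int(workers * poll_interval / seconds_per_job)
    backlog = 0
    history = []
    for _ in range(polls):
        # New results are added, then up to `capacity` of them are processed.
        backlog = max(0, backlog + jobs_per_poll - capacity)
        history.append(backlog)
    return history


# Assumed numbers: 100 results per poll, 1 second each, polling every 60 seconds.
print(backlog_after(3, 100, 1.0, 60))             # one worker: backlog keeps growing
print(backlog_after(3, 100, 1.0, 60, workers=2))  # two workers: backlog stays at zero
```

Under this simplified model, the backlog only stops growing once processing capacity per poll interval is at least the number of results arriving per poll, whether that comes from faster per-job processing or from more workers.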
Depends on: 1076776, 1059325, 1076774
Priority: -- → P1
Summary: Treeherder seems substantially slower to show results than TBPL → Treeherder seems substantially slower to show job results than TBPL
Blocks: 1073015
Summary: Treeherder seems substantially slower to show job results than TBPL → [Meta] Improve the time taken for pending/running/completed jobs to appear
Marking this meta as a P2, since the deps in need of fixing first are already P1s.
Priority: P1 → P2
Apparently 1 month is all it takes for me to forget about a bug and file a kinda-dupe of it (bug 1096863). Let's morph this bug to be about the original issue, which was ~"job ingestion getting backlogged" (which is now fixed due to additional VMs being spun up & segregating different tasks etc - see deps) and leave bug 1096863 to be about the longer term goal of making jobs appear sooner by reducing end-to-end processing time.
No longer blocks: 1073015
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Summary: [Meta] Improve the time taken for pending/running/completed jobs to appear → Treeherder production job ingestion is getting backlogged