Closed Bug 1175726 Opened 9 years ago Closed 9 years ago

Test jobs not being scheduled...

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: KWierso, Unassigned)

References

Details

When I look at https://secure.pub.build.mozilla.org/buildapi/pending I see there are 33,549 entries in the pending jobs list.

When I look at http://builddata.pub.build.mozilla.org/reports/pending/pending.html I see there are 19,431 pending ubuntu64-vm test jobs on fx-team, and 13,939 pending winxp-ix test jobs on fx-team.

But I don't actually see any of these pending jobs in treeherder on the revisions that buildapi says have a bunch of pending jobs.


All trees closed.
The pending count has gone up to 44k.

We think this is from http://hg.mozilla.org/build/buildbot-configs/rev/2de50c1598b4. There is a set of schedulers whose names carry a -<timeout> suffix. By changing the timeout value, we think an old set of schedulers was re-enabled, and the last code change they saw was on 2015-04-23. They are now processing every code push (change) since then, e.g. this from the tests scheduler master:

2015-06-17 14:32:23-0700 [-] tests-fx-team-xp-ix-debug-unittest-5-1800: triggering since we have 897/5 important changes
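
For anyone wanting to measure how far behind a scheduler is from lines like the one above, here is a rough sketch in Python. The regular expression is derived only from this single log line, and the reading of the trailing number in the scheduler name as the timeout is an assumption based on the "-<timeout> suffix" description; `parse_trigger_line` is a hypothetical helper, not part of buildbot.

```python
import re

# The sample line quoted in this bug.
LOG_LINE = (
    "2015-06-17 14:32:23-0700 [-] tests-fx-team-xp-ix-debug-unittest-5-1800: "
    "triggering since we have 897/5 important changes"
)

# Assumption: the final "-<digits>" in the scheduler name is the timeout suffix.
PATTERN = re.compile(
    r"^(?P<timestamp>\S+ \S+) \[-\] "
    r"(?P<base>\S+)-(?P<timeout>\d+): "
    r"triggering since we have (?P<seen>\d+)/(?P<threshold>\d+) important changes$"
)


def parse_trigger_line(line):
    """Return the fields of a 'triggering since we have N/M' line, or None."""
    match = PATTERN.match(line.strip())
    if match is None:
        return None
    return {
        "timestamp": match.group("timestamp"),
        "scheduler": match.group("base") + "-" + match.group("timeout"),
        "timeout": int(match.group("timeout")),
        "seen": int(match.group("seen")),
        "threshold": int(match.group("threshold")),
    }


info = parse_trigger_line(LOG_LINE)
# A scheduler that has seen far more changes than its threshold is catching up
# on a backlog rather than reacting to a fresh push.
backlog_ratio = info["seen"] / info["threshold"]
```

A scheduler whose `seen` count is hundreds of times its `threshold`, as here, is a strong hint that it is replaying old changes.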

We're stopping the test scheduler master, and will clean up the scheduler table in the db.
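
As an illustration of the kind of check that cleanup involves, here is a minimal sketch that compares scheduler names stored in the db against the names in the current config. It assumes both can be obtained as plain lists of names; `find_stale_schedulers` and the "-3600" scheduler name are hypothetical and not taken from the real schema or config.

```python
def find_stale_schedulers(db_names, configured_names):
    """Return names present in the scheduler table but absent from the config."""
    return sorted(set(db_names) - set(configured_names))


# Hypothetical data: the "-1800" name is the one from the log line in this bug;
# the "-3600" name is invented to stand in for a scheduler that is no longer configured.
db_names = [
    "tests-fx-team-xp-ix-debug-unittest-5-1800",
    "tests-fx-team-xp-ix-debug-unittest-5-3600",
]
configured_names = ["tests-fx-team-xp-ix-debug-unittest-5-1800"]

stale = find_stale_schedulers(db_names, configured_names)
```

Rows left behind this way keep their old "last change seen" state, which is what lets a later config change revive them with a large backlog.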
Component: Release Automation → General Automation
QA Contact: bhearsum → catlee
The schedulers are cleaned up, we're squashing the pending now.
Test scheduler is back up to accept sendchanges from build jobs that are finishing. Still working on squashing the pending.
Still working on cleaning up the old pending jobs.
25k jobs left to clean up - ETA 15 minutes
most of the backlog has been cleared - looking to see what the fallout is like
releng - please file a bug to follow up on this so these old schedulers get deleted by the maintenance script.
Reopened the trees. We'll see how the backlog goes.
Severity: blocker → normal
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
(In reply to Chris AtLee [:catlee] from comment #8)
> releng - please file a bug to follow up on this so these old schedulers get
> deleted by the maintenance script.

bug 1176132 for this.
See Also: → 1176132
Thanks to all of you who worked on fixing this yesterday. I didn't realize when I made a simple change to reduce the frequency of the SETA skipping that it would cause such scheduling issues. Lesson learned for the future :-)
Component: General Automation → General