Closed Bug 591074 Opened 14 years ago Closed 14 years ago

Tryserver skipping pushes, building them all at once, then canceling builds.

Categories

(Release Engineering :: General, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: froystig, Assigned: lsblakk)

Details

Attachments

(3 files)

There are currently about six push entries (see http://tests.themasta.com/tinderboxpushlog/?tree=MozillaTry) that haven't triggered any builds.  Then, Patrick Walton's push of changeset 8381fb9a63e9 triggered a surprisingly large number of builds under his name only, which have now gradually been disappearing.

He tells me that he's presently receiving many emails of all sorts of failures.
Try is currently under massive load (I think Aravind said 102) due to 8? simultaneous pushes. 8 pushes * 14 jobs each = 112 simultaneous full clones of the try repository, which isn't the smallest repository ever.

We're currently looking at this.
I bet all of that is fallout from the master dying, then starting many, many jobs when it came back up and overwhelming hg.m.o.

Try is closed right now.
Assignee: nobody → nrthomas
Priority: -- → P1
Thanks guys for looking into this right away. We are anxiously awaiting the return of our beloved try server :)
Have the builds actually been getting cancelled?  Or just associated with the wrong user on tbpl?
42 clones is three pushes doing all 14 jobs, and is also a cool number. This affects try clones only, because they're directed to a particular host; it does not affect things like m-c or builds/tools.
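For reference, the clone arithmetic quoted in the comments above can be checked with a short Python sketch. The function names are hypothetical and exist only for illustration; the figure of 14 jobs per push comes from this thread.

```python
# Illustrative arithmetic only; names are hypothetical, numbers are from this bug.
JOBS_PER_PUSH = 14  # each try push runs 14 jobs, each doing a full clone


def clones_for(pushes, jobs_per_push=JOBS_PER_PUSH):
    """Number of simultaneous full clones if every job of every push starts at once."""
    return pushes * jobs_per_push


def pushes_within(cap, jobs_per_push=JOBS_PER_PUSH):
    """Number of complete pushes that fit under a cap on concurrent clones."""
    return cap // jobs_per_push


print(clones_for(8))      # 8 simultaneous pushes, as estimated earlier
print(clones_for(3))      # three pushes doing all 14 jobs
print(pushes_within(42))  # how many full pushes a cap of 42 allows
```

With a cap of 42, three full pushes can clone at once and any further jobs wait for a free slot.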
Attachment #469676 - Flags: review?(lsblakk)
Attached image Staging result
This is what happened in staging with maxCount=1. Both builds started at the same time; the leak test build got to checkout first and the opt compile waited for its turn.
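The staging behaviour described above (two builds start together, one checks out while the other waits) is what a counting lock produces. The following is a minimal sketch using only the Python standard library as a stand-in; it is not the Buildbot configuration from the attached patch, and `run_checkouts`, `checkout_seconds`, and the build names are hypothetical.

```python
import threading
import time


def run_checkouts(build_names, max_count, checkout_seconds=0.01):
    """Simulate builds whose checkout step is limited to max_count at a time.

    Returns (peak, order): the highest number of checkouts observed running
    at once, and the order in which builds entered the checkout step.
    """
    slots = threading.BoundedSemaphore(max_count)  # the counting lock
    state_lock = threading.Lock()                  # protects the counters below
    active = 0
    peak = 0
    order = []

    def build(name):
        nonlocal active, peak
        with slots:  # blocks here while max_count checkouts are already running
            with state_lock:
                active += 1
                peak = max(peak, active)
                order.append(name)
            time.sleep(checkout_seconds)  # stand-in for the hg clone
            with state_lock:
                active -= 1

    threads = [threading.Thread(target=build, args=(n,)) for n in build_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak, order


peak, order = run_checkouts(["leak test", "opt"], max_count=1)
print(peak, order)
```

With `max_count=1` the peak never exceeds one checkout, whichever build reaches the lock first. Raising the limit to 42 would let that many clones proceed in parallel while later ones queue.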
Comment on attachment 469676 [details] [diff] [review]
Clamp number of clones to 42 (try only)

saw it in staging, love it, 42 is the best number.
Attachment #469676 - Flags: review?(lsblakk) → review+
buildbot master reconfig'd. Working on rerunning builds that were affected earlier.
Over to Lukas for the sendchanges to redo earlier failed builds.
Assignee: nrthomas → lsblakk
Sendchanges are working; I've advised the affected people by email that there's no need to repush. Try reopened.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering