Closed Bug 532228 Opened 15 years ago Closed 15 years ago

retrying sendchanges can bury buildbot master

Categories

(Release Engineering :: General, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: nthomas)

Details

Attachments

(4 files)

If a buildbot master (i.e. pm01) gets busy from closely spaced checkins, the sendchanges for unit test runs can get a very slow response. This causes the build to go into a warning state and confuses developers.

Since bug 523904 we retry the sendchange, resulting in up to 5 requests for the same build. Some of these requests are merged away, some not. Currently there are 200 jobs waiting on pm01 for win32, about 60 for mac.
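To illustrate how the retries pile up, here is a minimal sketch in Python. It is a toy model, not buildbot or buildbotcustom code: `SlowMaster`, `send_with_retries`, `merge_identical`, the change key, and the 90 s / 30 s figures are all hypothetical. It assumes the master records each change it receives even when its reply arrives after the client has given up, and it merges only exactly identical requests, which is simpler than what a real master does.

```python
from dataclasses import dataclass, field


@dataclass
class SlowMaster:
    """Toy stand-in for a busy master (hypothetical, not a Buildbot class).

    It queues every change it receives, but answers more slowly than the
    client is willing to wait.
    """
    response_seconds: float
    queue: list = field(default_factory=list)

    def sendchange(self, change_key: str, timeout: float) -> bool:
        # The request is recorded even if the reply comes back too late.
        self.queue.append(change_key)
        # False means the client saw a timeout.
        return self.response_seconds <= timeout


def send_with_retries(master: SlowMaster, change_key: str,
                      timeout: float, max_attempts: int) -> int:
    """Resend until the master answers in time; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        if master.sendchange(change_key, timeout):
            return attempt
    return max_attempts


def merge_identical(queue: list) -> list:
    """Collapse requests for the same change, keeping first-seen order."""
    return list(dict.fromkeys(queue))


# Assumed numbers: the master needs 90 s to answer, the client waits 30 s.
master = SlowMaster(response_seconds=90.0)
attempts = send_with_retries(master, "win32-unittest:rev-abc",
                             timeout=30.0, max_attempts=5)
pending = len(master.queue)          # requests queued for one build
merged = merge_identical(master.queue)
```

In this model every attempt times out on the client side yet still lands on the master, so one build ends up with five pending requests. Merging brings that back to one here only because the requests are identical; as noted above, on the real master some requests are not merged away.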
r+ from bhearsum on IRC
http://hg.mozilla.org/build/buildbotcustom/rev/6762cf582d75

We might back this out once the backlog is cleared.
Attachment #415518 - Flags: review+
Gozer, KaiRo - These changes might cause you some issues.
pm01 & pm02 have been reconfigured with these changes (finished a few minutes ago). The backlog should start to clear out faster now.
Backlog is cleared. Leaving this open to discuss how much of this we want to keep permanently. Your thoughts, catlee and bhearsum?
Severity: blocker → normal
Until we can guarantee that we won't submit the same build twice, I think we should leave merging on. With that on, I'd be OK having retries set to 1 or 2, maybe 3 - but given how bad things got yesterday I'm hesitant to crank it up higher.
We're still getting multiple retries on unit test sendchanges because the initial patch was incomplete (thanks for spotting that), and set a long enough timeout that even a 20-minute reconfig on pm will not drop changes.
Attachment #415712 - Flags: review?(bhearsum)
Attachment #415712 - Flags: review?(bhearsum) → review+
Pretty sure this is no longer an issue.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering