Closed
Bug 510552
Opened 15 years ago
Closed 15 years ago
buildbot should detect hangs better
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: joduinn, Assigned: lsblakk)
Currently, buildbot can be configured to time out and kill jobs that produce no output for a set amount of time. However, when a job is hung yet continues to produce a little output, buildbot is fooled into thinking the job is still running and doesn't kill it. Slaves then end up 100% occupied with never-ending jobs. We need a way to kill jobs after some time, regardless of whether output is being generated. This requires an upstream fix in buildbot. This is a blocker on running unittests on debug builds, which is a Q3 goal.
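To make the difference between the two limits concrete, here is a minimal standalone sketch in Python. It is not buildbot's code or API: the function name `run_with_limits` and the parameters `idle_timeout`, `max_time` and `poll_interval` are made up for illustration. It kills a child process either when it has been silent for too long or when its total run time exceeds a hard cap, so a job that keeps printing a little output still gets killed.

```python
import subprocess
import threading
import time


def run_with_limits(cmd, idle_timeout, max_time, poll_interval=0.05):
    """Run cmd and return (reason, output_lines).

    reason is "finished", "idle_timeout" (no output for idle_timeout
    seconds) or "max_time" (total run time exceeded max_time seconds).
    All names here are hypothetical, not taken from buildbot.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = []
    last_output = [time.monotonic()]  # list so the reader thread can update it

    def reader():
        # Record every line and remember when the child last produced output.
        for line in proc.stdout:
            lines.append(line)
            last_output[0] = time.monotonic()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    start = time.monotonic()
    reason = "finished"
    while proc.poll() is None:
        now = time.monotonic()
        if now - start > max_time:
            # Hard cap: fires even if the job is still chatty.
            reason = "max_time"
            proc.kill()
            break
        if now - last_output[0] > idle_timeout:
            # Classic "no output" timeout.
            reason = "idle_timeout"
            proc.kill()
            break
        time.sleep(poll_interval)

    proc.wait()
    thread.join(timeout=1)
    return reason, lines
```

With only the idle check, a child that prints a line every few milliseconds would never be killed; with the hard cap, `run_with_limits(cmd, idle_timeout=1.0, max_time=0.5)` reports `"max_time"` for such a job.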
Updated•15 years ago
Summary: buildbot should handle timeouts better → buildbot should detect hangs better
Comment 1•15 years ago
Moving to future till dependent bugs are fixed.
Component: Release Engineering → Release Engineering: Future
Comment 2•15 years ago
What? This bug isn't marked as depending on anything.
Component: Release Engineering: Future → Release Engineering
Comment 3•15 years ago
On https://bugzilla.mozilla.org/show_bug.cgi?id=372581#c73 catlee says:

> (In reply to comment #71)
> > So mochitest-browser-chrome worked for me on 1.9.1 on a Linux debug build.
> >
> > That said, from the log, it looks like you hit a known random orange: bug
> > 498339. But since you were in a debug build, the infinite loop in question was
> > producing output. Is there a way we could make buildbot detect the process as
> > hung if |timeout| seconds of output don't contain the string "TEST-", rather
> > than checking for no output at all?
>
> Yeah, I've got an upstream patch to buildbot checked in that sets a maximum run
> time for these shell commands, regardless of if output is being generated.
> We'll either get this when we upgrade to buildbot 0.7.13, or if we decide to
> cherry-pick those changes earlier.
>
> Having a maximum on the log size sounds like a great idea as well.

#### end of what catlee said

We will have to wait to see if that patch is already in our buildbot repos.
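The two other ideas floated in the quoted exchange (treat the job as hung if no output containing "TEST-" shows up within the timeout, and cap the log size) could be sketched roughly as below. This is only an illustration in Python, not the upstream buildbot patch: the class `OutputWatchdog` and its parameters are hypothetical, and timestamps are passed in explicitly so the logic can be checked without real waiting.

```python
class OutputWatchdog:
    """Tracks progress markers and log size for one job (hypothetical helper)."""

    def __init__(self, marker, marker_timeout, max_log_bytes, start_time=0.0):
        self.marker = marker                  # e.g. "TEST-"
        self.marker_timeout = marker_timeout  # seconds allowed without the marker
        self.max_log_bytes = max_log_bytes    # cap on total log size
        self.last_marker = start_time         # when the marker was last seen
        self.log_bytes = 0

    def feed(self, line, now):
        # Count every line toward the log size; only marker lines count as progress.
        self.log_bytes += len(line.encode("utf-8"))
        if self.marker in line:
            self.last_marker = now

    def verdict(self, now):
        # Returns a reason to kill the job, or None if it looks healthy.
        if self.log_bytes > self.max_log_bytes:
            return "log_too_large"
        if now - self.last_marker > self.marker_timeout:
            return "no_progress"
        return None


watchdog = OutputWatchdog(marker="TEST-", marker_timeout=10, max_log_bytes=1000)
watchdog.feed("TEST-PASS | some test\n", now=1.0)
watchdog.feed("assertion spam from an infinite loop\n", now=5.0)
watchdog.feed("assertion spam from an infinite loop\n", now=12.0)
```

In this run the job keeps producing output, but the last "TEST-" line arrived at t=1.0, so a check at t=12.0 reports `"no_progress"` while a check at t=5.0 still returns `None`.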
Comment 4 (Reporter)•15 years ago
(In reply to comment #3)
> On https://bugzilla.mozilla.org/show_bug.cgi?id=372581#c73
>
> catlee says:
> > (In reply to comment #71)
> > > So mochitest-browser-chrome worked for me on 1.9.1 on a Linux debug build.
> > >
> > > That said, from the log, it looks like you hit a known random orange: bug
> > > 498339. But since you were in a debug build, the infinite loop in question was
> > > producing output. Is there a way we could make buildbot detect the process as
> > > hung if |timeout| seconds of output don't contain the string "TEST-", rather
> > > than checking for no output at all?
> >
> > Yeah, I've got an upstream patch to buildbot checked in that sets a maximum run
> > time for these shell commands, regardless of if output is being generated.
> > We'll either get this when we upgrade to buildbot 0.7.13, or if we decide to
> > cherry-pick those changes earlier.
> >
> > Having a maximum on the log size sounds like a great idea as well.
> #### end of what catlee said
>
> We will have to wait to see if that patch is already in our buildbot repos

bhearsum: can you let us know if this patch is in our repo? It'd be great if we can try this out in staging... Otherwise, I guess we'd need to do a buildbot upgrade?
Comment 5•15 years ago
(In reply to comment #4)
> bhearsum: can you let us know if this patch is in our repo? It'd be great if we
> can try this out in staging... Otherwise, I guess we'd need to do a buildbot
> upgrade ??

It's not in our repo yet. We can cherry pick it though, we don't have to do a full upgrade.
Comment 6 (Assignee)•15 years ago
User repo with cherry picked patch in it: http://hg.mozilla.org/users/bhearsum_mozilla.com/buildbot/
Assignee: nobody → lsblakk
Status: NEW → ASSIGNED
Comment 7•15 years ago
This landed as part of bug 514242.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering