Closed
Bug 631854
Opened 14 years ago
Closed 14 years ago
Buildbot can't kill subprocesses with usePTY=False, but usePTY=True causes other problems
Categories
(Release Engineering :: General, defect)
Release Engineering
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dustin, Assigned: dustin)
References
()
Details
Attachments
(3 files)
14.52 KB,
patch
|
Details | Diff | Splinter Review | |
12.41 KB,
patch
|
catlee
:
review+
|
Details | Diff | Splinter Review |
11.34 KB,
patch
|
catlee
:
review+
|
Details | Diff | Splinter Review |
---- usePTY=False leads to situations where the test code forks, and the parent process exits, but the child processes keep running for too long. Buildbot knows the processes are still running, because it still has open pipes, but it has no way to reach those processes to kill them, since they are not isolated to a process group.
---- usePTY=True leads to things like
argv: ['rm', '-rf', 'tools']
environment:
CVS_RSH=ssh
...
_=/home/cltbld/bin/python
closing stdin
using PTY: True
process killed by signal 1
I thought 'closing stdin' was the problem, but I can actually replicate this failure with 'keeping stdin open', so I'm not sure what the problem is.
====
Bottom line is, we never need an actual PTY (that's only useful for tests that need to output to a terminal for some reason). We *do* need to be able to kill process groups, though. So this will involve merging the result of http://trac.buildbot.net/ticket/1796 to the slaves branch, and then disabling usePTY everywhere.
Assignee | ||
Comment 1•14 years ago
|
||
The relevant fixes in upstream
https://github.com/buildbot/buildbot/compare/a8a5c8ba4d98a8aba186a615109ba6b91ee3558b...e15751c301cff0ada7bd116449012c17d633ba09
do not apply at all cleanly against 0.8.0 - the slave code has been significantly refactored since that time. So this patch is not really backport-able; rather, it can be re-implemented for 0.8.0, which increases the risk factor significantly.
I'd like some feedback on deciding whether to push forward with this re-implementation, or try to deploy 0.8.3+patch instead. Risks with 0.8.3+patch are:
1. 'buildbot start' becomes 'buildslave start'
2. requires sniffing out and forward-porting any local patches that were not upstreamed
(1) is mitigated because runslave.py uses twistd directly, and I'd like to deploy runslave.py simuiltaneously with deploying the upgraded buildslave.
From a quick diff between BUILDBOT_0_8_0 and SLAVE_0_8_0, it looks like (2) is only the addition of the 'setMozillaBuildProperties' command, which should be easy (but not trivial) to forward-port.
On the basis of the above, I'm leaning toward upgrading to 0.8.3. Opinions?
Comment 2•14 years ago
|
||
Not applying cleanly vs 0.8.0 makes me lean towards 0.8.3, assuming that we think 0.8.3 is good for us. A p.o.c. run per platform would definitely make that clear.
(In reply to comment #1)
> 1. 'buildbot start' becomes 'buildslave start'
This is actually an opportunity, imo. We can create a buildbot wrapper that checks for ssh-ness on OSX and kills it, otherwise passes off "$@" to buildslave.
Comment 3•14 years ago
|
||
+1 to upgrading to 0.8.3. We're going to have to do it at some point!
The pain is only incrementally worse because of the buildbot->buildslave rename, and if that's really onerous, I'm sure we could create a 'buildbot' wrapper to do what we want.
Assignee | ||
Comment 4•14 years ago
|
||
(In reply to comment #2)
> This is actually an opportunity, imo. We can create a buildbot wrapper that
> checks for ssh-ness on OSX and kills it, otherwise passes off "$@" to
> buildslave.
Actually, I would like to build a buildbot script that says
import sys
print >>sys.stderr, "I think you meant 'sudo reboot' or 'shutdown /r /f /t 0'"
sys.exit(1)
We should not be in the habit of starting slaves without a reboot.
So it looks like it's an upgrade - I'll adjust bug 631851 appropriately.
Assignee | ||
Comment 5•14 years ago
|
||
This is the diff between BUILDBOT_0_8_8 and SLAVE_0_8_0 looks like.
Assignee: nobody → dustin
Assignee | ||
Comment 6•14 years ago
|
||
And these are the new diffs after the upgrade - from the tip of 'slaves' to the tip of 'upstream' (which is at v0.8.4-pre-150-g7175a0d)
From my read, we don't use this extra slave command anymore. If you can confirm that, I'll adjust the tip of 'slaves' to be equal to the tip of 'upstream'.
This will need to be staged for quite a while before deployment to work out any kinks in the killing code. Other than those changes, the code in v0.8.4-pre-150-g7175a0d is very similar to that in v0.8.1 through v0.8.3, so it's pretty well-tested by others.
Attachment #510852 -
Flags: review?(catlee)
Comment 7•14 years ago
|
||
Comment on attachment 510852 [details] [diff] [review]
m631854-buildbot-new-from-upstream.patch
I don't think we use this command either, but you may want to check with community projects. I think they were the last users of it.
Attachment #510852 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 8•14 years ago
|
||
Newly copied folks: my plan is to commit attachment 510852 [details] [diff] [review] as-is, which keeps the setMozillaBuildProperties command intact. However, if we can all agree that either:
1. you don't intend to use the build/buildbot 'slaves' branch for buildslaves; or
2. you don't need this slave-side command anymore
then it'd be nice to get rid of this change and just run a pure upstream implementation of the buildslave code. Let me know what you think. I'm off to run this patch in staging for a while.
Comment 9•14 years ago
|
||
Calendar seems to be the only project using setMozillaBuildProperties. We are currently using buildbot 0.7 and we don't plan on rebuilding our slaves without upgrading to buildbot 0.8 and revamping the whole config to match firefox/seamonkey/thunderbird.
Therefore we don't seem to be affected by this change.
Assignee | ||
Comment 10•14 years ago
|
||
With that in mind, here's a further commit that removes the mozilla-specific stuff from the slaves branch (including my failure to delete buildbot/slave/commands/mozilla.py).
With this change in place, the slaves branch is identical (modulo .hgtags) to the BUILDBOT_0_8_4_PRE_150_G715A0D tag.
I'll have another patch shortly to push this out with puppet.
Attachment #511221 -
Flags: review?(catlee)
Updated•14 years ago
|
Attachment #511221 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 11•14 years ago
|
||
attachment 511221 [details] [diff] [review] is landed (on the slaves branch)
Comment 12•14 years ago
|
||
(In reply to comment #8)
> Newly copied folks: my plan is to commit attachment 510852 [details] [diff] [review] as-is, which keeps
> the setMozillaBuildProperties command intact. However, if we can all agree
> that either:
>
> 2. you don't need this slave-side command anymore
>
As SeaMonkey release engineer, I agree with this.
Dustin, its also defined/specified in buildbotcustom (steps/misc.py) as SetMozillaBuildProperties which is also unused (and in fact its THIS that calendar uses, though a skim seems to implicate it uses buildbot code we'll be losing)
for posterity:
* SeaMonkey has an import |from buildbotcustom.steps.misc import SetMozillaBuildProperties| in buildbot-configs\seamonkey\ccfactory.py)
* Thunderbird has one in |from buildbotcustom.steps.misc import SetMozillaBuildProperties| in buildbot-configs\thunderbird\master.py
* Neither SeaMonkey nor Thunderbird makes use of this import
Assignee | ||
Comment 13•14 years ago
|
||
The puppet and runslave.py changes will happen in bug 631851.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•12 years ago
|
Product: mozilla.org → Release Engineering
You need to log in
before you can comment on or make changes to this bug.
Description
•