Linux talos slaves start buildbot twice on reboot

RESOLVED INVALID

9 years ago
5 years ago

People

(Reporter: lsblakk, Unassigned)

Details

When a build is stopped from the buildbot waterfall, the resulting reboot starts two instances of buildbot:

mozqa     5937  0.0  1.6  46776 16600 ?        Sl   16:48   0:00 gnome-terminal -x buildbot start talos-slave
mozqa     6066  1.6  0.6  10888  6984 ?        S    16:49   0:12 /usr/bin/python /usr/bin/buildbot start talos-slave
mozqa     6422  0.0  0.0   2980   768 pts/0    S+   17:01   0:00 grep buildbot

It's not a parent-child relationship; they are two separate processes, and having both running prevents the slave from coming back on the waterfall.
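For reference, a minimal sketch of how the duplicate could be spotted automatically from ps aux output like the listing above. The function name find_buildbot_starts and the matching rule (any command line containing "buildbot start", ignoring grep) are assumptions made for illustration; they are not part of the slave setup.

```python
def find_buildbot_starts(ps_aux_output):
    """Return (pid, command) pairs for processes whose command line
    contains 'buildbot start', skipping any grep that searched for them."""
    matches = []
    for line in ps_aux_output.splitlines():
        # ps aux prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or malformed line
        pid, command = int(fields[1]), fields[10]
        if "buildbot start" in command and not command.startswith("grep"):
            matches.append((pid, command))
    return matches


# The three lines pasted in this bug report.
SAMPLE = """\
mozqa     5937  0.0  1.6  46776 16600 ?        Sl   16:48   0:00 gnome-terminal -x buildbot start talos-slave
mozqa     6066  1.6  0.6  10888  6984 ?        S    16:49   0:12 /usr/bin/python /usr/bin/buildbot start talos-slave
mozqa     6422  0.0  0.0   2980   768 pts/0    S+   17:01   0:00 grep buildbot
"""

starts = find_buildbot_starts(SAMPLE)
for pid, command in starts:
    print(pid, command)
if len(starts) > 1:
    print("more than one 'buildbot start' process found")
```

On a live slave the same function could be fed the output of ps aux instead of the pasted sample.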
More info: when the slave comes up with the two buildbot processes, the /usr/bin/python /usr/bin/buildbot start talos-slave process runs tests:

 5979 ?        S      0:40 /usr/bin/python /usr/bin/buildbot start talos-slave
 6002 pts/0    Ss+    0:00  \_ python run_tests.py --noisy 20091112_1712_config.yml
 6071 pts/0    Z+     0:00      \_ [sh] <defunct>


while not appearing connected to the master. It then reboots, but this time it never comes back up at all and needs physical intervention.
Next time you see this, a 'ps auxwwwwf' might be helpful to track down where the second one is coming from. The 'f' prints the processes in a tree view, so you can see parent processes.
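The same parent check can be done in a script rather than by reading the tree. This is a rough sketch that assumes the input is the output of ps -eo pid,ppid,args. The helper names parse_ps and is_ancestor are made up for this example, and the PPID values in the sample are invented because the bug report does not list them.

```python
def parse_ps(ps_output):
    """Parse `ps -eo pid,ppid,args` output into {pid: (ppid, args)}."""
    procs = {}
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip the header line
        procs[int(fields[0])] = (int(fields[1]), fields[2])
    return procs


def is_ancestor(procs, ancestor, pid):
    """True if `ancestor` appears on the parent chain of `pid`."""
    seen = set()
    while pid in procs and pid not in seen:
        seen.add(pid)
        pid = procs[pid][0]  # step up to the parent
        if pid == ancestor:
            return True
    return False


# PPIDs below are invented for illustration; the bug report does not list them.
SAMPLE = """\
  PID  PPID COMMAND
    1     0 /sbin/init
 5937     1 gnome-terminal -x buildbot start talos-slave
 6066     1 /usr/bin/python /usr/bin/buildbot start talos-slave
"""

procs = parse_ps(SAMPLE)
related = is_ancestor(procs, 5937, 6066) or is_ancestor(procs, 6066, 5937)
print("parent-child" if related else "separate processes")
```

If neither process is an ancestor of the other, the two buildbot instances were started independently, and the PPID of each one shows what launched it.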
I did do a ps xf, which is why I said in comment 0 that it's not a parent-child relationship. They are two separate instances of buildbot, one started with gnome-terminal (which is the one we don't want, btw).

Here's what ps xf showed (I probably should have put this in the original comment): http://pastebin.mozilla.org/683422
This looks like it's closeable. Doesn't seem like it's the source of whatever issues you're hitting.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → INVALID
Product: mozilla.org → Release Engineering