mw32-ix-slave11 failed to kill its job correctly



6 years ago
10 months ago


(Reporter: bhearsum, Unassigned)



During the Thunderbird 16.0.1 repacks, mw32-ix-slave11 hung while trying to clone from hg due to a network issue. Buildbot eventually timed the job out and sent SIGKILL. However, while the master thought the job was dead, the last BuildStep was still running on the slave. That step ended up completing successfully at almost the same time as a newly started Buildbot job. Because of this, the repacks done on the "hung" slave overwrote the ones from the "good" slave mere minutes after they finished. This caused problems during updates, and for a while made us think we may have had an intrusion into our network.

Slaves need to be able to kill processes. I don't know whether this particular slave is busted, or if the platform as a whole is busted.
This might be because we have been patching the wrong copy of the file ever since we switched to using C:\mozilla-build\buildbotve.

coop found this when setting up the WinXP slave.

AFAIK we currently can't stop buildbot on any Windows platform.
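To illustrate the failure mode from the original report (the parent is killed on timeout but a child step keeps running), here is a minimal sketch of a timeout that kills the whole process tree. This is not Buildbot's or Twisted's actual code and it assumes a modern Python 3, not the Python on these slaves. The function names `run_with_timeout` and `kill_process_tree` are made up for illustration. On Windows it shells out to `taskkill /T /F`; elsewhere it puts the child in its own session and kills the process group.

```python
import os
import signal
import subprocess
import sys


def kill_process_tree(proc):
    """Kill proc and the processes it spawned (hypothetical helper)."""
    if sys.platform == "win32":
        # taskkill: /T also terminates child processes, /F forces termination.
        subprocess.call(
            ["taskkill", "/PID", str(proc.pid), "/T", "/F"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    else:
        # The child was started in its own session, so its pid is also
        # its process group id; killpg reaches the whole group.
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group is already gone
    proc.wait()  # reap the process so returncode is set


def run_with_timeout(cmd, timeout):
    """Run cmd; on timeout, kill its whole tree.

    Returns (returncode, timed_out).
    """
    kwargs = {}
    if sys.platform != "win32":
        kwargs["start_new_session"] = True
    proc = subprocess.Popen(cmd, **kwargs)
    try:
        return proc.wait(timeout=timeout), False
    except subprocess.TimeoutExpired:
        kill_process_tree(proc)
        return proc.returncode, True
```

With something like this, a hung step started via `run_with_timeout(cmd, timeout)` should not be able to outlive the timeout and race a newly started job, provided its children stay in the same process tree or group.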

Notes from coop for WinXP
Deployed correct and ref_image_version.txt
* discovered that we've been patching the version of under the python2.4 dir. I don't think that version is actually used now that we have C:\mozilla-build\buildbotve
* updated instructions:
* cd C:\mozilla-build\buildbotve\Lib\site-packages\twisted\internet
* mv
* C:\mozilla-build\wget\wget.exe

I believe we could adjust the opsi package to fix this easily.
This platform will be dying soon.
Last Resolved: 6 years ago
Resolution: --- → WONTFIX
Product: → Release Engineering


10 months ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations