During the Thunderbird 16.0.1 repacks, mw32-ix-slave11 hung while trying to clone from hg because of a network issue. Buildbot eventually timed the job out and sent SIGKILL. However, while the master thought the job was dead, the last BuildStep was still running on the slave. That step went on to finish successfully at almost the same time as a newly started Buildbot job did. As a result, the repacks done on the "hung" slave overwrote the ones from the "good" slave mere minutes after they finished. This caused problems during updates, and for a while made us think we might have had an intrusion into our network.

Slaves need to be able to kill processes. I don't know whether this particular slave is busted or the platform as a whole is.
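To illustrate the failure mode described above, here is a minimal sketch of a timeout that kills a child process and then confirms it is really gone before reporting the job as dead. This is not Buildbot's implementation; `run_with_timeout` and `kill_grace_s` are hypothetical names, it assumes Python 3.3 or later, and it only handles the direct child, not a whole process tree.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s, kill_grace_s=5.0):
    """Run argv; on timeout, kill the child and confirm it has exited.

    Returns (returncode, timed_out). Raises RuntimeError if the child
    is still alive after the kill, instead of silently assuming it died.
    """
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        # SIGKILL on POSIX, TerminateProcess on Windows.
        proc.kill()
        try:
            proc.wait(timeout=kill_grace_s)
        except subprocess.TimeoutExpired:
            raise RuntimeError("pid %d survived kill" % proc.pid)
        return proc.returncode, True


if __name__ == "__main__":
    # A child that would sleep for 30 seconds is cut off after 1 second.
    print(run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(30)"], 1.0))
```

The point of the second `wait` is that the caller never treats a job as dead on the strength of having sent the kill alone.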
This might be because we have always been patching the incorrect version of _dumbwin32proc.py since we switched to using C:\mozilla-build\buildbotve. coop found this when setting up the WinXP slave. AFAIK we currently can't stop buildbot on any Windows platform.

Notes from coop for WinXP
#########################
Deployed correct _dumbwin32proc.py and ref_image_version.txt
* discovered that we've been patching the version of _dumbwin32proc.py under the python2.4 dir. I don't think that version is actually used now that we have C:\mozilla-build\buildbotve
* updated instructions:
  * cd C:\mozilla-build\buildbotve\Lib\site-packages\twisted\internet
  * mv _dumbwin32proc.py _dumbwin32proc.py.bak
  * C:\mozilla-build\wget\wget.exe http://hg.mozilla.org/build/opsi-package-sources/raw-file/e55c081cb8cf/twisted_dumbwin32proc/CLIENT_DATA/_dumbwin32proc.py

I believe we could adjust the opsi package to fix this easily.
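As a rough sketch of how the "are we patching the right copy?" question could be checked, the helpers below report which file an interpreter actually loads for a module, and perform the same backup-then-replace step as the instructions above. Both function names are hypothetical, and the code assumes a Python new enough to ship `importlib`, which may not be true of the interpreter on these slaves.

```python
import importlib
import os
import shutil


def loaded_module_path(module_name):
    """Return the absolute path of the file this interpreter loads."""
    module = importlib.import_module(module_name)
    return os.path.abspath(module.__file__)


def install_patched_module(target_path, patched_path):
    """Move target_path to target_path + '.bak', then copy the patched
    file into place. Returns the backup path."""
    backup_path = target_path + ".bak"
    if os.path.exists(target_path):
        shutil.move(target_path, backup_path)  # same as the 'mv' step
    shutil.copyfile(patched_path, target_path)
    return backup_path
```

Run with the interpreter under C:\mozilla-build\buildbotve, `loaded_module_path("twisted.internet._dumbwin32proc")` should show whether the copy in use is the one under buildbotve or the one under the python2.4 dir. The module name here is inferred from the directory in the instructions.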
This platform will be dying soon.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → WONTFIX
Product: mozilla.org → Release Engineering
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations