problems after: "WINNT 5.2 mozilla-central moz2-win32-slave08 dep unit test" needs process killed

RESOLVED FIXED

Status

P1
blocker
RESOLVED FIXED
10 years ago
5 years ago

People

(Reporter: dbaron, Assigned: joduinn)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

"WINNT 5.2 mozilla-central moz2-win32-slave08 dep unit test" tinderbox is red because it needs a process (probably either firefox.exe or xpcshell.exe) killed.  (The process is holding open files that the build process tries, and fails, to remove.)
None of the passwords we have on file for cltbld, buildbot, mozqa, or root/administrator for any of build or qa seem to work for logging into this machine.
Assignee: server-ops → nobody
Component: Server Operations: Tinderbox Maintenance → Release Engineering: Maintenance
QA Contact: mrz → release
Assignee: nobody → joduinn
Priority: -- → P1
responding to page, and taking bug.
1) The machine password was temporarily reset on Thursday so dietrich and sdwilsh could do some debugging on that machine. Once they have finished debugging, we will reset the password to the usual passwords on file with IT. Sorry for the snafu, we should have let IT-oncall know of this experiment-in-progress.

2) I couldn't find any extra processes running on this machine, but rebooted it anyway, just in case.

3) I've now triggered a new unittest run on moz2-win32-slave08, to confirm all is ok. Please reopen this bug if there is still a problem.
OS: Other → Windows Server 2003
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
This box subsequently had a bunch of hg exceptions, which strangely didn't show up as bustage on the tinderbox waterfall, and only on builds where there were changes (the periodic scheduler. Since we most likely have an older mercurial on there, it isn't coping with the two heads since the 3.1b1 release branch was added. I've clobbered the box, because that's the very easiest way to start again.
er ... (the periodic scheduler was fine).
Um weird, the clobber didn't fix it. I've verified that
* the problem started immediately after the reboot of moz2-win32-slave08 on 2008-10-11
* the buildbot slave got started the same way as moz2-win32-slave07, which doesn't have this problem
* the problem is that a no-change build can update fine, but any build with a change reports an error to the buildbot waterfall and reports nothing to the tinderbox waterfall, leaving us with a perma-yellow build box.

We run
  C:\WINDOWS\system32\cmd.exe /c d:\mercurial\hg.EXE pull --verbose
and get this stdio output
  pulling from http://hg.mozilla.org/mozilla-central
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 5 changes to 5 files
  (run 'hg update' to get a working copy)
If you open the stdio log for this it never finishes loading, which is very odd given it has purple "exception" status.

The Python call stack for the error is:
Traceback (most recent call last):
  File "/tools/twisted-2.4.0/lib/python2.5/site-packages/twisted/spread/pb.py", line 884, in _recvMessage
    netResult = object.remoteMessageReceived(self, message, netArgs, netKw)
  File "/tools/twisted-2.4.0/lib/python2.5/site-packages/twisted/spread/flavors.py", line 119, in remoteMessageReceived
    state = method(*args, **kw)
  File "/tools/buildbot-077/lib/python2.5/site-packages/buildbot/process/buildstep.py", line 165, in remote_update
    self._finished(Failure())
  File "/tools/buildbot-077/lib/python2.5/site-packages/buildbot/process/buildstep.py", line 198, in _finished
    d = defer.maybeDeferred(self.remoteComplete, failure)
--- <exception caught here> ---
  File "/tools/twisted-2.4.0/lib/python2.5/site-packages/twisted/internet/defer.py", line 107, in maybeDeferred
    result = f(*args, **kw)
  File "/tools/buildbot-077/lib/python2.5/site-packages/buildbot/process/buildstep.py", line 357, in remoteComplete
    loog.finish()
  File "/tools/buildbot-077/lib/python2.5/site-packages/buildbot/status/builder.py", line 402, in finish
    self.merge()
  File "/tools/buildbot-077/lib/python2.5/site-packages/buildbot/status/builder.py", line 365, in merge
    text = "".join([c[1] for c in self.runEntries])
<type 'exceptions.UnicodeDecodeError'>: 'ascii' codec can't decode byte 0x92 in position 153: ordinal not in range(128)
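
For anyone hitting the same traceback: the failing line joins log chunks with "".join(...). In Python 2, joining a mix of byte strings and unicode objects decodes the byte strings implicitly as ASCII, so a single byte such as 0x92 is enough to raise this error. The sketch below reproduces that failure mode in Python 3, where the decode has to be written out explicitly. The chunk contents, the helper names (find_non_ascii, safe_join), and the choice of cp1252 as the source encoding are all assumptions for illustration; the bug does not say what the offending output actually was.

```python
def find_non_ascii(chunk):
    """Return (offset, byte value) pairs for every byte outside 7-bit ASCII."""
    return [(i, b) for i, b in enumerate(chunk) if b >= 0x80]


def safe_join(chunks, encoding="cp1252"):
    """Decode each chunk explicitly, replacing undecodable bytes, then join.

    cp1252 is an assumption: in that code page 0x92 is a curly apostrophe.
    """
    return "".join(c.decode(encoding, errors="replace") for c in chunks)


# Hypothetical log chunks; the real hg output is not shown in this bug.
chunks = [b"adding file changes\n", b"can\x92t remove file\n"]

# Strict ASCII decoding fails on the 0x92 byte, as in the traceback above.
try:
    "".join(c.decode("ascii") for c in chunks)
    ascii_failed = False
    failing_byte = None
except UnicodeDecodeError as exc:
    ascii_failed = True
    failing_byte = exc.object[exc.start]

# Locate the offending byte, then join with an explicit, tolerant decode.
offenders = find_non_ascii(chunks[1])
joined = safe_join(chunks)
```

A scan like find_non_ascii only points at where the bad byte sits in a chunk; it does not explain why a restart and clobber made the problem go away.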

This may be more informative:
 http://production-master.build.mozilla.org:2010/builders/WINNT%205.2%20mozilla-central%20moz2-win32-slave08%20dep%20unit%20test/builds/877/steps/hg/logs/err.html

Any ideas, Ben? Complete guess: deferred connections still hanging around on the master?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: "WINNT 5.2 mozilla-central moz2-win32-slave08 dep unit test" needs process killed → problems after: "WINNT 5.2 mozilla-central moz2-win32-slave08 dep unit test" needs process killed
Did a restart and clobber, it's green now.  Should we leave this bug open in case the issue comes up again?
I suggest we wait a few hours before closing.
Fine now.
Status: REOPENED → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED

Updated

10 years ago
Component: Release Engineering: Maintenance → Release Engineering
Product: mozilla.org → Release Engineering