Closed Bug 453071 Opened 16 years ago Closed 16 years ago

mozilla-central unit test builders missing

Categories

(Release Engineering :: General, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: nthomas)

Details

The last builds on the Firefox tree started at 2008/08/30 22:40 PDT; the waterfall hangs on "waiting on ...".
Tried restarting the master, but it returned an error:
  2008-08-31 16:53:58-0700 [-] error while parsing config file
  2008-08-31 16:53:58-0700 [-] error during loadConfig
...
        --- <exception caught here> ---
          File "/tools/buildbot-trunk/lib/python2.5/site-packages/buildbot/master.py", line 462, in loadTheConfigFile
            self.loadConfig(f)
          File "/tools/buildbot-trunk/lib/python2.5/site-packages/buildbot/master.py", line 480, in loadConfig
            exec f in localDict
          File "/builds/moz2_unittest/master.cfg", line 355, in <module>
            'factory': moz2_win32_unittest_factory2,
        exceptions.NameError: name 'moz2_win32_unittest_factory2' is not defined
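
A config error like this can be caught before restarting the master by executing the config text in a scratch namespace first. What follows is a minimal Python 3 sketch of that idea, not what Buildbot itself does; `check_config` is a hypothetical helper, and the sample config is trimmed down from the builder in this bug. Buildbot also has a `checkconfig` subcommand, though whether the version installed here includes it is not stated in this bug.

```python
def check_config(source, filename="master.cfg"):
    """Return None if the config text executes cleanly, else an error string.

    Hypothetical helper for illustration: it compiles and executes the
    config in a throwaway namespace, much as the master's loadConfig
    does with `exec`, and reports the first exception raised.
    """
    namespace = {}
    try:
        code = compile(source, filename, "exec")
        exec(code, namespace)
    except Exception as exc:
        return "%s: %s" % (type(exc).__name__, exc)
    return None


# Trimmed-down stand-in for the failing part of master.cfg.
bad_config = """
builders = []
firefox_trunk_win2k3_builder1 = {
    'name': "WINNT 5.2 mozilla-central qm-win2k3-unittest-hw dep unit test",
    'factory': moz2_win32_unittest_factory2,
}
builders.append(firefox_trunk_win2k3_builder1)
"""

print(check_config(bad_config))
# prints: NameError: name 'moz2_win32_unittest_factory2' is not defined
```

A real master.cfg imports Buildbot modules, so a check like this would have to run with the same Python environment as the master.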

Backed up master.cfg to master.cfg.20080831, and commented out the local change:
  firefox_trunk_win2k3_builder1 = {
      'name': "WINNT 5.2 mozilla-central qm-win2k3-unittest-hw dep unit test",
      'slavenames': ['qm-win2k3-unittest-hw'],
      'builddir': "trunk_win2k3_hw",
      'factory': moz2_win32_unittest_factory2,
      'category': "HEAD",
  }
...
  builders.append(firefox_trunk_win2k3_builder1)

There are other local changes there, like renaming the builddirs. I left those in, which resulted in buildbot saying "builder created" on the restart. Slaves are reconnecting and picking up jobs.
Priority: -- → P1
Ok, they're back but failing the reftest for bug 273681. #developers is on the case, could be bug 407216.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
And it pooped out again, in the not-responding-on-http-and-all-the-slaves-disconnected sense.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
No runaway processes; buildbot is still in the process list, and twistd.log ends with:
2008-08-31 20:00:58-0700 [-] Polling Hg server at http://hg.mozilla.org/mozilla-central/index.cgi/pushlog
2008-08-31 20:01:58-0700 [-] Polling Hg server at http://hg.mozilla.org/mozilla-central/index.cgi/pushlog
Surely that's done asynchronously. Nothing on the VM console except:
  FS-Cache: Loaded
  FS-Cache: netfs 'nfs' registered for caching
I can access /builds/share OK; other filesystems are r/w. Nothing in /var/log/messages.

I'm going to reboot the master after updating the vmware tools.
Gah, same problem. Thought it was bug 453085 (hanging on a pushlog request) but that doesn't seem to be it.
Strike that last comment, there was a poll request out on the pushlog which was causing buildbot to hang. Another restart makes it responsive again. Slaves should reconnect and start building again.
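
One way to keep a stalled poll from blocking indefinitely is to put a timeout on the request. This is a minimal Python 3 sketch under that assumption, not Buildbot's actual Hg poller (which is not shown in this bug); `poll_once` and the 30-second default are made up for illustration.

```python
import urllib.request


def poll_once(url, timeout=30.0):
    """Fetch url and return the body bytes, or None if the request fails or stalls.

    Hypothetical helper: the timeout applies to each blocking socket
    operation, so a server that accepts the connection but never
    answers is abandoned after `timeout` seconds instead of hanging.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return None
```

A caller would treat `None` as "skip this poll and try again at the next interval" rather than waiting on the pushlog server.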
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Component: Release Engineering: Maintenance → Release Engineering
Product: mozilla.org → Release Engineering