Closed Bug 451968 Opened 17 years ago Closed 17 years ago

Sunbird & Lightning tinderboxen for Linux & Windows have stopped working again [Update of nightly builds broken]

Categories

(Release Engineering :: General, defect, P2)

x86
All

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: tonymec, Assigned: nthomas)

Details

See http://tinderbox.mozilla.org/showbuilds.cgi?tree=Sunbird&hours=24 and http://tinderbox.mozilla.org/showbuilds.cgi?tree=Sunbird-Mozilla1.8&hours=24
All Sb tboxen for Linux & Windows stopped working some 14 hours ago. This is a regression: bug 448877 and/or bug 444987 have become "unfixed", so to speak.
cb-vmware01 was rebooted, so all the VMs on it were rebooted.
Keywords: regression
(In reply to comment #1)
> cb-vmware01 was rebooted, so all the VMs on it were rebooted.
Well, maybe so, but AFAICT the Sb {Lin|Win} {trunk|Moz1.8} tboxen have _still_ not started building again, after (for some of them) more than 24 hours. Notice how the Windows columns have already disappeared from the pages mentioned in comment #0, and the Linux ones are steadily either grey or yellow for almost their whole height.
Severity: normal → major
Adding Calendar owner & peers to CC (insofar as their addresses of record are known to Bugzilla) in the hope that they'll know whom to ping to restart the tboxen.
oh, and a couple more CC in case I filed this bug in the wrong component.
Tony, some context for you. The Sunbird & Lightning build machines are administered by those projects, in particular by Ause (ause@sun.com). MoCo IT (e.g. mrz) are only on the hook for colo issues and the hosting hardware for the Linux & Windows VMs. They will have notified Ause of the host reboot that Reed mentioned in comment #1 (although MoCo Release Eng didn't get a copy), but he's on vacation at the moment. It's the weekend, so others who might be able to help are thin on the ground; I suggest you ask in #build on Monday. AFAIK we've never set up tinderbox machines to auto-restart the tinderbox process on reboot. Some Firefox boxes have moved in that direction since switching to buildbot, and the same techniques could be used by Ause if he wants to. The real fix here is to figure out why the VM host (cb-vmware01) is having problems. This is the 2nd or 3rd time it's done it; last time it was a hang on Aug 2. Please search for a bug filed on that, and create a new one if you don't find anything (mozilla.org :: Server Ops).
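The auto-restart idea mentioned above could look roughly like the following. This is only a minimal sketch under stated assumptions: `START_CMD` is a hypothetical placeholder (the real tinderbox start command on these boxes is not given in this bug), and the names `start_tinderbox`, `ensure_running` and `supervise` are invented for illustration. It is not how the Firefox buildbot machines are actually configured.

```python
import subprocess
import time
from typing import Callable, Optional

# Hypothetical start command: the real tinderbox invocation on these
# boxes is not given in this bug, so substitute the actual one.
START_CMD = ["/path/to/start-tinderbox"]


def start_tinderbox() -> subprocess.Popen:
    """Launch the build process (hypothetical command, see START_CMD)."""
    return subprocess.Popen(START_CMD)


def ensure_running(proc: Optional[subprocess.Popen],
                   start: Callable[[], subprocess.Popen]):
    """Return a live process handle, calling start() if none is alive.

    proc.poll() returns None while the process is still running and
    its exit code once it has finished.
    """
    if proc is None or proc.poll() is not None:
        return start()
    return proc


def supervise(start: Callable[[], subprocess.Popen], checks: int,
              interval: float = 60.0, sleep=time.sleep):
    """Check `checks` times, `interval` seconds apart; restart as needed."""
    proc = None
    for _ in range(checks):
        proc = ensure_running(proc, start)
        sleep(interval)
    return proc
```

Something like `supervise(start_tinderbox, checks=1440)` would only help after a host reboot if the supervisor itself is launched at boot, for example from an init script on Linux or a scheduled task on Windows; that wiring is left out here.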
(In reply to comment #5)
Thanks, Nick, for filling me in on the Calendar context. I CC'ed ause when filing the bug, so I was not _totally_ on the wrong track. I didn't know he's on holiday though. Let's hope whoever is filling in for him will have got bugmail from this bug, either by watching him or from my attempts in comment #3 and comment #4, and that the boxen will be restarted soon after they get on the job on Monday. I'll try to find a bug as you said, but I know I'm notoriously poor at finding already-filed bugs which I haven't yet seen, so there's a possibility that I'll file a duplicate. We'll see.
Filed bug 452026.
CC'ing Kurt who stands in for Ause.
Kurt, could you please restart the tinderboxen while Ause is away?
Summary: Sunbird tinderboxen for Linux & Windows have stopped working again → Sunbird & Lightning tinderboxen for Linux & Windows have stopped working again
Summary: Sunbird & Lightning tinderboxen for Linux & Windows have stopped working again → Sunbird & Lightning tinderboxen for Linux & Windows have stopped working again [Update of nightly builds broken]
Ouch, Kurt is on vacation this week, too. I've heard that Ause is back tomorrow.
(In reply to comment #11)
> Ouch, Kurt is on vacation this week, too. I've heard that Ause is back
> tomorrow.
Ah, that's why Kurt didn't restart them. If Ause's also sheriffing the Sun boxen (the Sb-moz1.8 ones are all aflame) he'll have work to do. Well, wait and see.
Assignee: nobody → nthomas
Priority: -- → P2
The four boxes now have the tinderbox process restarted on them, and the first builds have appeared on the tinderbox waterfalls. Since each box is responsible for 3 different builds (Lightning en-US, Sunbird en-US, Sunbird locales) it'll take a while for all the columns to reappear. Reopen if you've waited a few hours and they haven't turned up.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Ah, Nick, thanks a lot. I see that at least one build per box has started being done, the Windows ones being slowest. The Linux Lt-moz1.8 has turned orange, which I suppose is a different bug. I'm waiting until all columns reappear before I mark this as VERIFIED, but I have good hope. :-)
I've tweaked the orange box (lt18-linux-tbox) so that app-launch tests can talk to the X server; it should go green on the next attempt.
OK, here's what I see:
http://tinderbox.mozilla.org/showbuilds.cgi?tree=Sunbird
  One build each (nightlies) of Sb and Lt for Windows
  Several builds each (the first one being a nightly) of Sb and Lt for Linux
http://tinderbox.mozilla.org/showbuilds.cgi?tree=Mozilla-l10n
  Three Sb-Linux (the first one being a nightly)
  Sb-WinNT has started building and is still busy
http://tinderbox.mozilla.org/showbuilds.cgi?tree=Sunbird-Mozilla1.8
  One each (being nightlies) of Sb and Lt for Windows
  An orange then a nightly of Lt-Linux
  Two nightlies of Sb-Linux (I guess the usual hour at which nightlies start building happened between these two builds)
The completed nightlies are visible on the FTP server. I have downloaded an Sb-Linux-moz1.8 nightly and it works (better than the latest one before the reboot, in fact). Therefore I'm marking this VERIFIED FIXED.
Status: RESOLVED → VERIFIED
Component: Release Engineering: Maintenance → Release Engineering
Product: mozilla.org → Release Engineering