Closed Bug 411341 Opened 12 years ago Closed 12 years ago

mothball 1.8.0 Firefox and Thunderbird build machines

Categories

(Release Engineering :: General, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: joduinn, Assigned: joduinn)

Attachments

(2 files)

Now that we have EOL'd both the Firefox 1.5 and Thunderbird 1.5 products, we should take these build machines offline and mothball them.

Need to see if anyone is relying on these, first.
Priority: -- → P3
If most of these are VMs (correct me if I'm wrong on that), can't we just move them to the community VM server and turn off notifications for them? If someone complains when one dies, we'll know they were using them, otherwise we can turn them off after a set interval (say, 3 months or so?).

(Side note: Do we have someone at Debian or other Linux vendors that might be able to speak to this?)
(In reply to comment #1)
> If most of these are VMs (correct me if I'm wrong on that), can't we just move
> them to the community VM server and turn off notifications for them? If someone
> complains when one dies, we'll know they were using them, otherwise we can turn
> them off after a set interval (say, 3 months or so?).
From http://tinderbox.mozilla.org/showbuilds.cgi?tree=Mozilla1.8.0 it looks like the list of machines to be mothballed are:

crazyhorse (VM)
moz180-linux-tbox (VM)
moz180-win32-tbox (VM)
tb180-win32-tbox (VM)
xserve04 (physical hardware; rare PPC xserve machine)

Moving from the internal build network to the external community network requires scrubbing keys, resetting logins, resetting IPs, with advertised downtimes etc. Doing this work makes sense and is fine if we know someone in the community is relying on these builds and we decide not to keep supporting the machines internally. However, I don't understand doing this as part of mothballing - couldn't we just leave them powered off in their current locations?

Seems to me the plan should be to advertise a mothball date, then power these off in their current location at that advertised date in the future and see if anyone notices. If yes, obviously power back online and figure out what to do. However, if no-one notices, wait until some further advertised date to backup, scrub and recycle them. Does that seem reasonable?

ps: A quick look on bonsai shows no checkins on the 1.8.0 branch since 01dec2007, apart from our usual Build-version-bump checkins and one bugfix that was required for the TB1.5.0.14 release. http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_0_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-12-01+00%3A00&maxdate=2008-01-11+00%3A00&cvsroot=%2Fcvsroot



 
> (Side note: Do we have someone at Debian or other Linux vendors that might be
> able to speak to this?)
RedHat and Ubuntu have already confirmed they are not using any of these builds. I've not yet contacted Debian, but will do. 

What other forums should be notified? Newsgroups like dev-planning? The 1.8.0 tinderbox tree itself? 
(In reply to comment #2)
> Moving from internal build network to external community requires scrubbing
> keys, resetting logins, resetting IPs, with advertised downtimes etc.

I guess I assumed all that work was only applicable if we were giving other users access to these machines, which we wouldn't be. Regardless...

> Seems to me the plan should be to advertise a mothball date, then power these
> off in their current location at that advertised date in the future and see if
> anyone notices. If yes, obviously power back online and figure out what to do.
> However, if no-one notices, wait until some further advertised date to backup,
> scrub and recycle them. Does that seem reasonable?

That seems reasonable to me.

> RedHat and Ubuntu have already confirmed they are not using any of these
> builds. I've not yet contacted Debian, but will do.

Great! Glad they were contacted.

> What other forums should be notified? Newsgroups like dev-planning? The 1.8.0
> tinderbox tree itself? 

Definitely newsgroups. I'd say m.c.mozpad (in case a platform dev uses these), m.d.builds, m.d.planning, m.d.a.firefox, m.d.a.thunderbird, and m.d.a.calendar. It's a lot, but better safe than sorry. :) I'd send follow ups to m.d.planning, fwiw.

And as you suggested, a notice on the tree couldn't hurt.

Do you have a proposed date in mind?
(In reply to comment #2)
> From http://tinderbox.mozilla.org/showbuilds.cgi?tree=Mozilla1.8.0 it looks
> like the list of machines to be mothballed are:
> 
> crazyhorse (VM)
> moz180-linux-tbox (VM)
> moz180-win32-tbox (VM)
> tb180-win32-tbox (VM)
> xserve04 (physical hardware; rare PPC xserve machine)

A slight tweak on this: crazyhorse also does Thunderbird builds on the 1.8 branch, so we'd be commenting out the 1.8.0 builds in the multi-config.pl rather than turning it off.
(In reply to comment #3)
> (In reply to comment #2)
> Do you have a proposed date in mind?
How about 2 weeks notice before closing the machines, and powering them off. And then, starting from that, another 2 months notice before we recycled the hardware?
(In reply to comment #4)
> (In reply to comment #2)
> > From http://tinderbox.mozilla.org/showbuilds.cgi?tree=Mozilla1.8.0 it looks
> > like the list of machines to be mothballed are:
> > 
> > crazyhorse (VM)
> > moz180-linux-tbox (VM)
> > moz180-win32-tbox (VM)
> > tb180-win32-tbox (VM)
> > xserve04 (physical hardware; rare PPC xserve machine)
> 
> A slight tweak on this: crazyhorse also does Thunderbird builds on the 1.8
> branch, so we'd be commenting out the 1.8.0 builds in the multi-config.pl
> rather than turning it off.
Good catch, Nick. Thanks. You're 100% correct.
Assignee: nobody → joduinn
(In reply to comment #5)
> (In reply to comment #3)
> > (In reply to comment #2)
> > Do you have a proposed date in mind?
> How about 2 weeks notice before closing the machines, and powering them off.
> And then, starting from that, another 2 months notice before we recycled the
> hardware?
To be less vague, how about these specific dates?
- 14jan08: announce pending offline
- 28jan08: take machines offline
- 31mar08: scrub and recycle machines
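
As a rough check of how these dates line up with the "2 weeks, then another 2 months" proposal, the gaps can be computed with a few lines of Python. This is only an illustrative sketch; the variable names are made up here and nothing like this was run as part of this bug.

```python
from datetime import date

# Dates proposed in this comment
announce = date(2008, 1, 14)   # announce pending offline
offline = date(2008, 1, 28)    # take machines offline
recycle = date(2008, 3, 31)    # scrub and recycle machines

# Gap between the announcement and power-off (the "2 weeks notice")
notice_period = offline - announce

# Gap between power-off and recycling (the "another 2 months notice")
grace_period = recycle - offline

print("notice period in days:", notice_period.days)
print("grace period in days:", grace_period.days)
```

The first gap is exactly two weeks, and the second is a little over two calendar months because the recycle date was rounded to the end of March.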
(In reply to comment #3)
> (In reply to comment #2)
> > RedHat and Ubuntu have already confirmed they are not using any of these
> > builds. I've not yet contacted Debian, but will do.
> Great! Glad they were contacted.
Debian just confirmed they are not using any of these builds either.

Also, in dev.planning, confirmed that SeaMonkey 1.0 & 1.1 are not using these builds either.
Status: NEW → ASSIGNED
This morning, I announced the pending mothballing:
- at the weekly Mozilla lunchtime meeting. 
- on http://tinderbox.mozilla.org/showbuilds.cgi?tree=Mozilla1.8.0 
- in the following newsgroups: mozilla.community.mozpad, mozilla.dev.builds, mozilla.dev.planning, mozilla.dev.apps.calendar, mozilla.dev.apps.firefox, mozilla.dev.apps.thunderbird.

Leaving this bug as P3, while we wait until the 28th.
Can we keep crazyhorse and moz180-linux-tbox around still for a bit?  I'm looking to take over the 1.8 branch (there's a discussion on security-group) since linux distributors still need to maintain these for a while longer. 

It would be helpful during our impending checkins to make sure that the 1.8 tree still stays sane.
The machines in question here are for the Firefox/Thunderbird 1.5 branches. AFAIK there are no plans to mothball 1.8 machines at this time.
(In reply to comment #12)
> comment 2 suggests otherwise
> 

I see lots of mentions of 1.8.0 and Firefox/Thunderbird 1.5.0.x, but nothing about 1.8 or Firefox/Thunderbird 2. I'm confused. You said you are "looking to take over the 1.8 branch". Did you mean the 1.8.0 branch?
Apologies, yes.  1.8.0.  (/me secretly hates that we used 1.8.0 and 1.8.1 for the two branches)
Oh ok, then please ignore my comments!
(In reply to comment #10)
> Can we keep crazyhorse and moz180-linux-tbox around still for a bit?  I'm
> looking to take over the 1.8 branch (there's a discussion on security-group)
> since linux distributors still need to maintain these for a while longer. 
> 
> It would be helpful during our impending checkins to make sure that the 1.8
> tree still stays sane.
I assume you mean the 1.8.0 branch; we're not mothballing any 1.8 machines.

Please clarify which linux distributors you mean - the ones we've already talked to were fine with mothballing (see comment#2, #8).
Of the distributions, RHEL, Debian, Ubuntu, and quite likely SLED still have to support 1.5.x for at least a while longer.  As of today, none of us have really been using these machines.

However, Alexander Sack (ubuntu) and I (Red Hat/Fedora) have specifically expressed interest in continuing to support this branch ourselves, and we'd like to do so in the mozilla.org world so we don't have various forks of the browser going around. It just occurred to me this morning that these machines would be extremely useful for us to start using in line with this transition, _especially_ in the coming days when we push about 20-30 patches into the tree, which could potentially cause build bustage; having these tinderboxes still available will be a huge help.
We need a plan to transition the 1.8.0 builds from MoCo machines to vendor-managed machines. Mozilla should keep the tinderbox.mozilla.org page itself running as a shared community resource as long as it is useful, but it doesn't need to run the build machines themselves (much as it doesn't run the machines on the ports page).

I don't think there was anything magic about the Jan 28 date; it looks to me like a date was picked to prevent this from bumbling along on its own forever. Now that we know we don't need a "turn the machines off and see if someone notices" date, I would hope something can be worked out that gets the MoCo build team its machines back (comment 7 talks about March) without abruptly cutting off tinderbox service. Especially not if there's a large patch backlog landing imminently.

Couple of weeks to bring up vendor-run build machines, a week or two of machine overlap, and then MoCo can mothball the VMs? I have no idea what's realistic, just throwing this out there.
dveditz: Yes, there is an important difference between mothballing the 1.8.0 pages on tinderbox.mozilla.org and mothballing the 1.8.0 build machines which feed the 1.8.0 pages on tinderbox. I was only asking about mothballing the 1.8.0 build machines. I was never suggesting mothballing the tinderbox pages. And both are quite different from supporting the 1.8.0 branch, which I understand some linux distros are interested in doing. Maybe that's the source of confusion here.

Chris, Alexander: Does the distinction between those three topics above help clarify? If so, how much time would you need to bring up your own 1.8.0 build machines for this branch? We'd obviously want to take the time to make any transition as smooth as possible. As Dan guessed, the Jan28th date was only picked to bring focus to this matter. I note that we EOL'd Firefox1.5 in May2007, and EOL'd Thunderbird1.5 in Dec2007, so I feel we do need to escalate this transition somewhat. 
(In reply to comment #19)
> Chris, Alexander: Does the distinction between those three topics above help
> clarify? If so, how much time would you need to bring up your own 1.8.0 build
> machines for this branch? We'd obviously want to take the time to make any
> transition as smooth as possible. As Dan guessed, the Jan28th date was only
> picked to bring focus to this matter. I note that we EOL'd Firefox1.5 in
> May2007, and EOL'd Thunderbird1.5 in Dec2007, so I feel we do need to escalate
> this transition somewhat. 
> 
Christopher, Alexander: Does the above make sense? And if this means you need transition time, do you have any idea how much transition time, starting when?
Yeah, it makes sense.  I think we need some time to get our own tinderbox builds set up, so we aren't left with our pants down when we need to do security rebuilds off the branch.  I'm trying to figure out how quickly I can get this done, but would another 6-8 weeks be okay for now to allow me to start setting up a build machine?
(In reply to comment #21)
> Yeah, it makes sense.  I think we need some time to get our own tinderbox
> builds set up, so we aren't left with our pants down when we need to do
> security rebuilds off the branch.  I'm trying to figure out how quickly I can
> get this done, but would another 6-8 weeks be okay for now to allow me to start
> setting up a build machine?

Christopher: Certainly agree about the importance of a smooth transition, but this bug is already 6 weeks old, and *another* 6-8 weeks will have us supporting those machines until end-March/early-April!? :-( Is there anything we can do to reduce this setup time? For example, if you can get a physical box, we'd be happy to help walk you through any issues to speed up this transition.

Also, do you care about win32+mac+linux or only linux? For example, do you mind if we take the mac and win32 machines offline?
(In reply to comment #20)
> (In reply to comment #19)
> > Chris, Alexander: Does the distinction between those three topics above help
> > clarify? If so, how much time would you need to bring up your own 1.8.0 build
> > machines for this branch? We'd obviously want to take the time to make any
> > transition as smooth as possible. As Dan guessed, the Jan28th date was only
> > picked to bring focus to this matter. I note that we EOL'd Firefox1.5 in
> > May2007, and EOL'd Thunderbird1.5 in Dec2007, so I feel we do need to escalate
> > this transition somewhat. 
> > 
> Christopher, Alexander: Does the above make sense? And if this means you need
> transition time, do you have any idea how much transition time, starting when?
> 
Alexander: gentle ping... does this plan seem ok to you?
(In reply to comment #22)
> Christopher: Certainly agree about the importance of a smooth transition, but
> this bug is already 6 weeks old, and *another* 6-8 weeks will have us
> supporting those machines until end-March/early-April!? :-( Is there anything
> we can do to reduce this setup time? For example, if you can get a physical
> box, we'd be happy to help walk you through any issues to speed up this
> transition.

The bottleneck is getting the physical hardware.  I'm not going to go out and purchase a machine out of pocket for this, and getting hardware thru my employer isn't just something that happens overnight, sadly.  Setting up the tinderboxes should be relatively easy at that point.

> Also, do you care about win32+mac+linux or only linux?  For example, do you mind if we take the mac and win32 machines offline?

I'd like to at least make it so the win/mac trees compile in 1.8.0.x but feel free to take the win/mac boxes down, since they aren't my primary concern.
(In reply to comment #24)
> (In reply to comment #22)
> > Christopher: Certainly agree about the importance of a smooth transition, but
> > this bug is already 6 weeks old, and *another* 6-8 weeks will have us
> > supporting those machines until end-March/early-April!? :-( Is there anything
> > we can do to reduce this setup time? For example, if you can get a physical
> > box, we'd be happy to help walk you through any issues to speed up this
> > transition.
> The bottleneck is getting the physical hardware.  I'm not going to go out and
> purchase a machine out of pocket for this, and getting hardware thru my
> employer isn't just something that happens overnight, sadly.  Setting up the
> tinderboxes should be relatively easy at that point.

Ah...paperwork hurdles... :-( best of luck with that! No need to go out-of-pocket on this. Our intent here is to do a smooth transition, so we'll just have to keep handling pages as these machines flake out while you wrestle with administrivia! Let us know if your timeline moves out (or in!).


> > Also, do you care about win32+mac+linux or only linux?  For example, do you mind if we take the mac and win32 machines offline?
> I'd like to at least make it so the win/mac trees compile in 1.8.0.x but feel
> free to take the win/mac boxes down, since they aren't my primary concern.
If you are planning on bringing up linux+mac+win32 slaves in the next few weeks, we'll "do the right thing" and keep supporting all 3 o.s. as part of this transition. However, if you're planning on doing linux now and mac+win32 at some later point, I'd love to take you up on the offer, and shutdown the mac+win32 machines, as it will quickly help reduce our support headaches.
(In reply to comment #25)
> (In reply to comment #24)
> > (In reply to comment #22)
> > > Christopher: Certainly agree about the importance of a smooth transition, but
> > > this bug is already 6 weeks old, and *another* 6-8 weeks will have us
> > > supporting those machines until end-March/early-April!? :-( Is there anything
> > > we can do to reduce this setup time? For example, if you can get a physical
> > > box, we'd be happy to help walk you through any issues to speed up this
> > > transition.
> > The bottleneck is getting the physical hardware.  I'm not going to go out and
> > purchase a machine out of pocket for this, and getting hardware thru my
> > employer isn't just something that happens overnight, sadly.  Setting up the
> > tinderboxes should be relatively easy at that point.
> 
> Ah...paperwork hurdles... :-( best of luck with that! No need to go
> out-of-pocket on this. Our intent here is to do a smooth transition, so we'll
> just have to keep handling pages as these machines flake out while you wrestle
> with administrivia! Let us know if your timeline moves out (or in!).
> 
> 
> > > Also, do you care about win32+mac+linux or only linux?  For example, do you mind if we take the mac and win32 machines offline?
> > I'd like to at least make it so the win/mac trees compile in 1.8.0.x but feel
> > free to take the win/mac boxes down, since they aren't my primary concern.
> If you are planning on bringing up linux+mac+win32 slaves in the next few
> weeks, we'll "do the right thing" and keep supporting all 3 o.s. as part of
> this transition. However, if you're planning on doing linux now and mac+win32
> at some later point, I'd love to take you up on the offer, and shutdown the
> mac+win32 machines, as it will quickly help reduce our support headaches.

Christopher: Can I take you up on your offer to at least shut down the win32+mac builders? Most of my pages over the last 2 weeks have been for intermittent issues on the win32 slave, so if it doesn't complicate your life too much, this would be helpful to me.

(I'll still leave the linux builder running while you wrestle with corporate paperwork for new hardware!)
(In reply to comment #23)
> (In reply to comment #20)
> > (In reply to comment #19)
> > > Chris, Alexander: Does the distinction between those three topics above help
> > > clarify? If so, how much time would you need to bring up your own 1.8.0 build
> > > machines for this branch? We'd obviously want to take the time to make any
> > > transition as smooth as possible. As Dan guessed, the Jan28th date was only
> > > picked to bring focus to this matter. I note that we EOL'd Firefox1.5 in
> > > May2007, and EOL'd Thunderbird1.5 in Dec2007, so I feel we do need to escalate
> > > this transition somewhat. 
> > > 
> > Christopher, Alexander: Does the above make sense? And if this means you need
> > transition time, do you have any idea how much transition time, starting when?
> > 
> Alexander: gentle ping... does this plan seem ok to you?
> 
Alexander: can I assume you are ok with this plan?
I think we are fine with taking the win32+mac builds down now, though they are most likely the hardest for us (linux distros) to get up on our own :)
Yeah, feel free to take down mac+win boxes.  It's more of a "nice to have" for me.  Ideally someone would step up to own those ports.
Turned off moz180-win32-tbox (VM) and tb180-win32-tbox (VM). Stopped tinderbox build processes running on xserve04 (physical hardware; rare PPC xserve machine). 

Also, removed all three from nagios in bug#421421.
Depends on: 421421
This removes the file-age checks on {firefox,thunderbird}/nightly/latest-mozilla1.8.0 for win32 and mac.

Checking in Firefox_mozilla1.8.0.txt;
/cvsroot/mozilla/tools/tinderbox-configs/monitoring/Firefox_mozilla1.8.0.txt,v  <--  Firefox_mozilla1.8.0.txt
new revision: 1.14; previous revision: 1.13
done
Checking in Thunderbird_mozilla1.8.0.txt;
/cvsroot/mozilla/tools/tinderbox-configs/monitoring/Thunderbird_mozilla1.8.0.txt,v  <--  Thunderbird_mozilla1.8.0.txt
new revision: 1.13; previous revision: 1.12
done
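
For readers unfamiliar with the file-age checks mentioned above: the general idea is to raise an alert when the newest file in a nightly directory is older than some threshold. A minimal Python sketch of that idea follows. The function names and the 36-hour threshold are assumptions made for illustration; this is not the actual format or content of the tinderbox-configs monitoring files.

```python
import os
import time


def newest_file_age_hours(directory, now=None):
    """Return the age in hours of the most recently modified file in
    `directory`, or None if the directory contains no regular files."""
    now = time.time() if now is None else now
    mtimes = [
        os.path.getmtime(os.path.join(directory, name))
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    ]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def is_stale(directory, max_age_hours=36, now=None):
    """Treat a directory as stale if it is empty or if its newest file
    is older than `max_age_hours` (hypothetical threshold)."""
    age = newest_file_age_hours(directory, now)
    return age is None or age > max_age_hours
```

Once a platform's builders are turned off, a check like this would fire permanently for that platform's directory, which is why the win32 and mac checks were removed.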
Are the l10n builds still necessary on the linux boxes? If the goal in keeping the linux boxes running is to catch compile errors, then l10n is just creating a whole bunch of bits for us to virus scan.
(In reply to comment #33)
> No, you can drop those

Done. 
Christopher, Alexander: Are you working on both Firefox and Thunderbird? Or only on Firefox?

I ask because I'm trying to see if you need us to keep the linux Thunderbird machines running until you have your hardware online...or if we can stop "crazyhorse" from producing those Thunderbird builds continuously?
We are supporting Thunderbird as well. So if possible I'd prefer that you shut them down in sync with the Firefox ones.
ok, we will close the thunderbird linux machines at the same time as the firefox linux machines.

Which is a good time to ask: any update on those replacement firefox machines? Tomorrow will be 6 weeks since the "6-8 weeks" estimate. 
Christopher, Alexander: 

We are now 7 weeks since the "6-8 weeks" estimate, 3.5 months since EOL of Thunderbird 1.5 and 10 months since EOL of Firefox 1.5. Our machines are still building these EOL'd products daily (including weekends). 

Any update on your replacement machines? 
Christopher, Alexander: It's now 8 weeks since the "6-8 weeks" estimate, and 3 months since I filed this bug and we first talked about this.

Do you have any info on your replacement machines, so I can finally mothball our EOL'd 1.8.0 machines?
Depends on: 425052
There's been no update in this bug, or checkins on the MOZILLA_1_8_0_BRANCH branch since 24mar. There's also been no reply to personal emails sent to Christopher / Alexander on 14apr.


I'm scheduling the power-off of these machines for 28apr2008; that's exactly three months after the date I first scheduled for powering off these machines in comment#7.
From offline emails with Alexander, Blizzard, and others, we've agreed to power off these two machines this Friday (2nd May).
We waited an extra week, in case there were any last minute appeals. Still silence, so moz180-linux-tbox now powered off.
This morning, I also powered off crazyhorse, forgetting from comment#4, that crazyhorse was doing builds for 1.8.0 *and* for 1.8. 

Restarted crazyhorse, and just changed the multi-config.pl as attached. 

Despite a quick look around, I couldn't find where this file lives in cvs, so this attachment is just a diff of my before & after changes. This is not a real patch in the "cvs diff -u" sense.

Sorry if that seems confusing, but I wanted to note what the changes were. I'd be happy to redo this as a patch if appropriate.
Whiteboard: waiting on blockers
(In reply to comment #43)
> Despite a quick look around, I couldn't find where this file lives in cvs, so
> this attachment is just a diff of my before & after changes. This is not a real
> patch in the "cvs diff -u" sense.

That's fine, we don't store these files in CVS.
ok, to summarize:

crazyhorse (no longer running 1.8.0 builds)
moz180-linux-tbox (powered off)
moz180-win32-tbox (powered off)
tb180-win32-tbox (powered off)
xserve04 (no longer running 1.8.0 builds; rare PPC xserve machine being reimaged)

Removed these machines from tinderbox, nagios. 

At this point, all we have left to do here is back up and then delete/reformat. :-)
Whiteboard: waiting on blockers
I've cleaned out the nightly/latest-mozilla1.8.0 and tinderbox-builds/*-mozilla1.8.0* dirs for both Firefox and Thunderbird.
bm-xserve04 is now reimaged and running fine (see bug#414734 for details).

The following VMs have been backed up, so I've now deleted them:
moz180-linux-tbox
moz180-win32-tbox
tb180-win32-tbox

All done here, so closing.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering