Status

mozilla.org Graveyard
Server Operations
--
enhancement
RESOLVED FIXED
8 years ago
3 years ago

People

(Reporter: justdave, Assigned: fox2mike)

Tracking

Bug Flags:
needs-downtime +

Details

(Whiteboard: Tuesday Nov. 17th 6am-9am PST)

Surf got left out of our attempts to upgrade everything to RHEL5 a while back, for the sole reason that we were intending to replace the machine with dm-stage01/02 and figured it would just pick up the upgrade as part of being decommissioned or reused.

However, dm-stage01/02 haven't panned out yet, and it may be a while, so there's no reason to hold off on it.  This is one of the last machines we have left still on RHEL4.

This would require about an hour of downtime for stage.mozilla.org, and there may be weird small problems cropping up over the next day or so after the upgrade while we fine-tune things that may work "differently" in RHEL5.
Flags: needs-downtime+
Benefits of upgrading:

1) Our new backup stuff may or may not work on RHEL4, and will probably require hacking to get it to work there, but we know it works on RHEL5.

2) On RHEL5 we'll have the capability to lock down what you can do in a shell on stage.mozilla.org (so that you can scp, rsync, sftp, etc, but you can't get an interactive shell on a normal user account) which is something we've wanted to do for a while.

3) Many general performance improvements and easier package management.
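
Item 2 above does not say how the lock-down would be implemented. As a rough illustration only, one common pattern is a forced-command wrapper that inspects the command the client asked for and refuses anything that is not a file-transfer program. The sketch below assumes sshd exposes the request as SSH_ORIGINAL_COMMAND; the allowlist and the function name `is_transfer_command` are hypothetical, not what was deployed on stage.mozilla.org.

```python
import shlex

# Hypothetical allowlist of program names a transfer-only account may run.
# A real deployment would pin full paths appropriate to the host.
ALLOWED_PROGRAMS = {"scp", "rsync", "sftp-server"}

# Characters that could chain or redirect commands in a shell.
SHELL_METACHARACTERS = ";&|`$<>\n"


def is_transfer_command(original_command):
    """Return True if the requested command is an allowed transfer program.

    `original_command` is the string sshd would expose as
    SSH_ORIGINAL_COMMAND. An empty or missing value means the client asked
    for an interactive shell, which is refused.
    """
    if not original_command or not original_command.strip():
        return False
    # Refuse anything containing shell metacharacters outright.
    if any(ch in original_command for ch in SHELL_METACHARACTERS):
        return False
    try:
        argv = shlex.split(original_command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are refused.
        return False
    if not argv:
        return False
    # Exact match on the program name; paths are deliberately not accepted.
    return argv[0] in ALLOWED_PROGRAMS
```

A wrapper built on this would exec the command when the check passes and exit with an error otherwise, so accounts such as ffxbld/cltbld could still upload but not open a shell.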
From a RelEng point of view we should pick a downtime away from any releases, so the next couple of weeks would be fine. I can be available to help verify that it's working afterward, assuming you're thinking of PDT evening downtime.
(In reply to comment #1)
> Benefits of upgrading:
> 
> 1) Our new backup stuff may or may not work on RHEL4, and will probably require
> hacking to get it to work there, but we know it works on RHEL5.
> 
> 2) On RHEL5 we'll have the capability to lock down what you can do in a shell
> on stage.mozilla.org (so that you can scp, rsync, sftp, etc, but you can't get
> an interactive shell on a normal user account) which is something we've wanted
> to do for a while.

Can we please make sure this is a separate item? We've been pretty good about not adding new things that depend on the shell, but there's a ton of existing stuff that would need to be changed before we can do this on ffxbld/cltbld accounts.
It's already a separate item.  It's not part of this, this is just a prerequisite for it.

Updated

8 years ago
Severity: minor → enhancement
(In reply to comment #4)
> It's already a separate item.  It's not part of this, this is just a
> prerequisite for it.

Thanks for the clarification

Comment 6

8 years ago
(In reply to comment #3)
> [...], but there's a ton of existing
> stuff that would need to be changed before we can do this on ffxbld/cltbld
> accounts.

As a note, seabld is one of those accounts too, as is the Thunderbird one (tbirdbld?). Is there a bug for that further step?
(In reply to comment #6)
> As a note, seabld is one of those accounts too, as is the Thunderbird one
> (tbirdbld?). Is there a bug for that further step?

I'll file one when we decide to go ahead with it (and make sure it gets trumpeted loudly well in advance on dev-planning and planet and so forth).  The existing bug kind of has it lumped in with the stage.m.o rebuild, which is indefinitely stalled at the moment.
Assignee: server-ops → justdave

Comment 8

8 years ago
Build team, when's a good time to schedule this? Pass it back over when ready.
Assignee: justdave → nobody
Component: Server Operations → Release Engineering
Flags: needs-downtime+
QA Contact: mrz → release

Comment 9

8 years ago
I sent mail out to releng last night, and due to overwhelming ambivalence, we've decided upon:

Tuesday Nov. 17th 6am-9am PST

with a back-up choice of:

Thursday Nov. 19th 6am-9am PST

Let us know if that doesn't work for IT.
Assignee: nobody → server-ops
Component: Release Engineering → Server Operations
QA Contact: release → mrz
works for me.
Assignee: server-ops → justdave
Flags: needs-downtime+
Whiteboard: Tuesday Nov. 17th 6am-9am PST
(In reply to comment #10)
> works for me.

Do you want me to send out a downtime notice for this, or were you planning to?
(Assignee)

Comment 12

8 years ago
I'll be doing the actual upgrade work. I plan to start at 0600 PST and will be in the usual channels in case people need to get in touch :)
Assignee: justdave → shyam
A downtime notice has been sent to the usual places. A separate copy, with wording specific to the mirror network, has been sent to the mirrors list.
Status: the NFS mounts did not come up when the machine was booted after the upgrade. Unfortunately, this didn't get caught until after the rsync hubs had all rsynced out the now-empty docroot, so the releases site is kind of hosed at the moment. Working on getting the NFS mounts back online.
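
The failure described above (an unmounted, empty docroot being synced out to the hubs) is the kind of thing a pre-sync guard can catch. The following is a minimal sketch, not the fix that was applied here: it assumes the docroot itself is the NFS mount point, and the function name `docroot_is_safe` and its parameters are hypothetical.

```python
import os


def docroot_is_safe(path, require_mount=True, min_entries=1):
    """Return True only if `path` looks populated enough to publish.

    Checks, in order: the path is a directory, it is a mount point (when
    `require_mount` is set), and it holds at least `min_entries` entries.
    """
    if not os.path.isdir(path):
        return False
    # An unmounted NFS directory is just an empty local directory.
    if require_mount and not os.path.ismount(path):
        return False
    count = 0
    with os.scandir(path) as entries:
        for _ in entries:
            count += 1
            if count >= min_entries:
                return True
    return count >= min_entries
```

A sync job would call this before serving or pushing the tree and abort when it returns False, so an empty docroot never propagates. If the NFS mounts live in subdirectories rather than at the docroot, the same check would need to run against each of them.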
(Assignee)

Updated

8 years ago
Blocks: 529313
Blocks: 529375
(Assignee)

Comment 15

8 years ago
I think we can safely call this done. Thanks to Dave for the help!
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard