Closed Bug 748369 Opened 12 years ago Closed 12 years ago

Pull old preproduction machines

Categories

(Infrastructure & Operations :: RelOps: General, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rail, Assigned: arich)

preproduction-stage.build.sjc1.mozilla.com.
preproduction-master.build.sjc1.mozilla.com.

These VMs can be shut down and removed from monitoring.

Thanks in advance.
The replacements aren't responding correctly to Nagios checks yet. Have they been finished?
Assignee: server-ops-releng → arich
I've shut down the old VMs. Nagios is still not functioning for the new ones, and there was no request to move the DNS CNAME for the bmo subdomain, so I haven't done that either.
Depends on: 750280
(In reply to Amy Rich [:arich] [:arr] from comment #2)
> I've shut down the old vms.  Nagios is still not functioning for the new
> ones, nor was there a request to move the dns CNAME for the bmo subdomain,
> so I haven't done that, either.

I can now see NRPE daemons running on both machines. Does anything else need to be done to enable the Nagios checks?
The master is still failing several of its checks for various reasons:

https://nagios.mozilla.org/nagios/cgi-bin/status.cgi?host=preproduction-master.srv.releng.scl3

The NRPE timeout for the MySQL check is probably caused by a MySQL misconfiguration or a missing network flow. I'm not sure what the errors with the queue checks are (ask catlee?), and there are too many buildbot processes running.

And the storage VM is missing a number of check definitions:

https://nagios.mozilla.org/nagios/cgi-bin/status.cgi?host=preproduction-stage.srv.releng.scl3

If those aren't already in Puppet, they should be added there.
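
For the MySQL timeout mentioned above, one rough way to tell a missing flow apart from a service misconfiguration is to probe the port from the monitoring side. The following is only a minimal sketch, not anything taken from this bug: the function name `probe_tcp` is hypothetical, and the interpretation of the results is a heuristic (a refused connection usually means the host is reachable but nothing is listening on that address, while a timeout often points to a firewall silently dropping packets).

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connection attempt (hypothetical helper).

    Returns one of:
      "open"    - the connection succeeded
      "refused" - host reachable, but nothing accepted the connection
      "timeout" - no answer within `timeout` seconds (possibly a dropped flow)
      "error"   - any other socket-level failure (e.g. name resolution)
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

Calling `probe_tcp(host, 3306)` against the database host (3306 being the default MySQL port) from the machine that runs the check would then give a first hint: "timeout" suggests looking at flows, "refused" suggests looking at the MySQL bind address or whether the service is running.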
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations