Closed Bug 435068 Opened 12 years ago Closed 11 years ago

confirm all new moz2 builders/unittest machines on nagios

Categories

(Release Engineering :: General, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: joduinn, Assigned: bhearsum)

Attachments

(1 file)

In the moz2 meeting today, it came up that some unittest machines were offline over the weekend before anyone manually noticed.

Should do complete sweep of all moz2/3.next machines, to ensure they are all correctly monitored by nagios.
Priority: -- → P1
For the record, dbaron was talking about Nagios monitoring on the Tinderbox Waterfall, so when something falls off we get alerts like this:
<nagios> surf:tree - Mozilla1.8 is CRITICAL: 

The configs for that stuff are here:
http://mxr.mozilla.org/mozilla/source/tools/tinderbox-configs/monitoring/


(We should definitely still be doing monitoring on the machines themselves though.)
Depends on: 435216
Assignee: nobody → bhearsum
Attachment #322972 - Flags: review?(nrthomas) → review+
Comment on attachment 322972
[checked in] files needed for surf:Nightly checks for mozilla-central and actionmonkey

Checking in Firefox_actionmonkey.txt;
/cvsroot/mozilla/tools/tinderbox-configs/monitoring/Firefox_actionmonkey.txt,v  <--  Firefox_actionmonkey.txt
initial revision: 1.1
done
RCS file: /cvsroot/mozilla/tools/tinderbox-configs/monitoring/Firefox_mozilla-central.txt,v
done
Checking in Firefox_mozilla-central.txt;
/cvsroot/mozilla/tools/tinderbox-configs/monitoring/Firefox_mozilla-central.txt,v  <--  Firefox_mozilla-central.txt
initial revision: 1.1
done
Attachment #322972 - Attachment description: files needed for surf:Nightly checks for mozilla-central and actionmonkey → [checked in] files needed for surf:Nightly checks for mozilla-central and actionmonkey
Depends on: 436354
(In reply to comment #0)
> Should do complete sweep of all moz2/3.next machines, to ensure they are all
> correctly monitored by nagios.
> 
OK, I've done a sweep of all cvs-trunk and moz2 machines. The few not already
monitored by nagios are being added in bug 436429.
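For anyone repeating this kind of sweep, here is a rough sketch of one way to script it. It is not the procedure used for this bug. It assumes the monitored hosts are declared in standard Nagios object config (`define host { ... }` blocks with a `host_name` directive) and that the machine inventory is available as a plain list. The machine names, the sample config, and the function names are hypothetical.

```python
import re


def monitored_hosts(nagios_cfg_text):
    """Collect host_name values from Nagios 'define host { ... }' blocks."""
    hosts = set()
    # Each block runs from 'define host {' to the first closing brace.
    for block in re.findall(r"define\s+host\s*\{(.*?)\}", nagios_cfg_text, re.S):
        match = re.search(r"^\s*host_name\s+(\S+)", block, re.M)
        if match:
            hosts.add(match.group(1))
    return hosts


def unmonitored(inventory, nagios_cfg_text):
    """Return inventory machines that have no Nagios host definition."""
    return sorted(set(inventory) - monitored_hosts(nagios_cfg_text))


# Hypothetical machine names and config, for illustration only.
SAMPLE_CFG = """
define host {
    use        generic-host
    host_name  example-linux-01
}
define host {
    use        generic-host
    host_name  example-win32-01
}
"""
INVENTORY = ["example-linux-01", "example-win32-01", "example-mac-01"]

if __name__ == "__main__":
    # Prints the machines that still need to be added to nagios.
    print(unmonitored(INVENTORY, SAMPLE_CFG))
```

This only checks that a host definition exists. It says nothing about whether the right service checks are attached to each host, so that still needs a manual look.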
Okay, reed set up surf:Tinderbox and surf:Nightly. We'll need to create Tier1_Mozilla2.txt and Tier1_Actionmonkey.txt after we work out tier1 support with IT.

This isn't blocking opening of mozilla-central anymore, though.
Blocks: 433384
No longer blocks: 422754
Status: NEW → ASSIGNED
Priority: P1 → P3
Depends on: 440384
What's the status of IT support for the mozilla-central/actionmonkey Buildbot?
We're going to have to update a lot of tinderbox monitoring files when we move machines over.
Depends on: 441945
I've just added support doc links for all of the moz2/3.next machines in the inventory app. They point to the docs here: http://wiki.mozilla.org/ReleaseEngineering:ITSupport#Moz_2_2

Mrz, Justin said that you need to OK these before we consider them Tier1. Can you have a look?
From a RelEng standpoint all of the necessary nagios monitoring is in place. I'm going to assume that this is the case for IT as well. Closing this bug now.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering