Closed Bug 627821 Opened 12 years ago Closed 11 years ago

Relax nagios checks on dm-wwwbuild01 file age

Categories

(Infrastructure & Operations :: RelOps: General, task)

Platform: x86, macOS
Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: zandr, Assigned: rtucker)

These checks are very noisy these days, since the files aren't always getting built and uploaded in time.

Bug 625978 tracks the fix for the root cause.

In the meantime, we should relax the age check to 15 minutes to reduce nagios chatter.
Did this already happen?
Assignee: server-ops-releng → rtucker
These service checks are commented out:
#http_age&contact_groups build::dm-wwwbuild01:build.mozilla.org:/builds/builds-running.js!7m
#http_age&contact_groups build::dm-wwwbuild01:build.mozilla.org:/builds/builds-pending.js!7m
#http_age&contact_groups build::dm-wwwbuild01:build.mozilla.org:/builds/builds-4hr.js.gz!7m

Would you like one of them to be enabled and set to a 15 minute threshold?
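
For readers unfamiliar with this kind of check, here is a minimal sketch of what an HTTP file-age check with a threshold such as 7m or 15m might do. The actual http_age plugin is not shown in this bug, so the function names (`parse_threshold`, `check_age`), the use of the Last-Modified header, and the single-letter unit suffixes are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Conventional Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def parse_threshold(text):
    """Convert a threshold such as '7m' into seconds (assumed suffixes: s, m, h)."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(text[:-1]) * units[text[-1]]


def check_age(last_modified, threshold, now=None):
    """Return (status, age_in_seconds) for an HTTP Last-Modified header value."""
    now = now or datetime.now(timezone.utc)
    age = (now - parsedate_to_datetime(last_modified)).total_seconds()
    status = OK if age <= parse_threshold(threshold) else CRITICAL
    return status, age
```

Under these assumptions, a file last modified 10 minutes ago would alert at the current 7m threshold but would pass at the proposed 15m threshold.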
I found them. They are hardcoded in the services.cfg file rather than generated, which is how we usually define these checks. Do you want me to change all of them to a different value? If so, what value?
We haven't had much flapping recently, but it's somewhat dependent on our VM and the DB server in ways we don't yet understand. And load on the buildbot cluster has been low while we're all frozen for the 4.0 releases, so we'll have to see what happens as everything cranks back up.

Which is a long way of saying it depends on what value they all have at the moment. Could you look it up?
They are currently set to 7 minutes.
Let's leave them at 7 minutes. They haven't been flapping recently, and anyway we want this information to be timely. Plus, we know there's a leak somewhere in the buildapi code that provides the files we're monitoring, and the sooner we know that's slowing things down, the better.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations