Update the expected value for the mozilla-central tree status alert

RESOLVED FIXED

Status

mozilla.org Graveyard
Server Operations
4 years ago
3 years ago

People

(Reporter: philor, Assigned: ashish)

Details

(Whiteboard: :Moc)

(Reporter)

Description

4 years ago
<nagios-releng>: Fri 18:22:45 PDT [4994] treestatus.mozilla.org:Tree status - mozilla-central is CRITICAL: status: approval required != open () (http://m.mozilla.org/Tree+status+-+mozilla-central)

That's been the new expected status for the mozilla-central tree since June 2nd, so it's been alerting ever since, apart from a time or two when I've rage-acked it. Might not be the highest-value alert around.

Comment 1

4 years ago
I don't believe these alerts have much value. Trees are often (if not over 50% of the time) closed for non-infrastructure reasons. Until I have a chance to double back to bug 931542, there isn't a satisfactory way for these alerts to be tied to the infra-only reasons.

So for now, we should either:
a) Disable the alerts entirely until bug 931542 is fixed.
b) Leave the alert enabled, but set the acceptable states to "open" and "approval required" for *all trees*, since "approval required" is rarely used for infrastructure reasons.

My preference is for (a).
Blocks: 931079

Updated

4 years ago
Flags: needinfo?(ashish)
(Reporter)

Comment 2

4 years ago
Acking it after every status change got boring, so I downtimed it for 100y.

Comment 3

4 years ago
I've downtimed the entire host, since we were still getting the pointless alerts for other trees (e.g. inbound):
downtime treestatus.mozilla.org 9999999d bug 1025401
The check was intended for IT to know when the trees were closed, not the sheriffs or buildduty. I've changed the alerting group to be the #sysadmins channel only.

I've also modified the check to set the default status of mozilla-central to "approval required".

Ashish: you may want to change the check to allow multi-value for expected state so one can look for EITHER "open" or "approval required" as acceptable states.
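
A minimal sketch of the multi-value idea, assuming a per-tree mapping from tree name to a set of acceptable states. The names EXPECTED_STATES, DEFAULT_EXPECTED and is_acceptable are hypothetical and are not taken from the actual check script:

```python
# Hypothetical per-tree configuration: each tree maps to the set of
# states that should NOT raise an alert. Unlisted trees expect "open" only.
EXPECTED_STATES = {
    "mozilla-central": {"open", "approval required"},
}
DEFAULT_EXPECTED = {"open"}


def is_acceptable(tree, status):
    """Return True if `status` is one of the acceptable states for `tree`."""
    return status in EXPECTED_STATES.get(tree, DEFAULT_EXPECTED)
```

With this shape, mozilla-central would stay quiet in either "open" or "approval required", while any other tree would still alert on anything but "open".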

Comment 5

4 years ago
(In reply to Amy Rich [:arich] [:arr] from comment #4)
> The check was intended for IT to know when the trees were closed, not the
> sheriffs or buildduty. 

Yeah, but due to comment 1 I don't think it's currently useful for IT either, given that 90% of the alerts are noise from non-infra tree closures. Note also that this alert only goes off after a human has manually closed a tree (the trees can't close on their own due to other alerts etc) - so that human will normally file an IT bug if the problem is IT.

> I've changed the alerting group to be the #sysadmins
> channel only.
> 
> I've also modified the check to set the default status of mozilla-central as
> "approval required"

That's great - thank you :-)
Blocks: 993044
Whiteboard: :Moc

Updated

4 years ago
Flags: needinfo?(ashish)
(Assignee)

Comment 6

4 years ago
Fixing this the "right way", as :arr mentioned in Comment 4...
Assignee: server-ops → ashish
Status: NEW → ASSIGNED
(Assignee)

Comment 7

4 years ago
Alright, I've modified the script to check the output against regexes, so the check will no longer alert if mozilla-central's status is "open" or "approval required".
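
For illustration only, a regex-based comparison along these lines might look like the sketch below. The function name check_tree_status, the pattern and the message format are assumptions, not the contents of the real script; the exit codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL):

```python
import re

# Conventional Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def check_tree_status(tree, status, expected_pattern):
    """Return (exit_code, message) for a tree's reported status.

    The status is acceptable only if the whole string matches
    `expected_pattern`, so "reopened" does not match "open".
    """
    if re.fullmatch(expected_pattern, status):
        return OK, f"Tree status - {tree} is OK: status: {status}"
    return CRITICAL, (
        f"Tree status - {tree} is CRITICAL: "
        f"status: {status} does not match /{expected_pattern}/"
    )


# Hypothetical pattern accepting either state for mozilla-central.
code, message = check_tree_status(
    "mozilla-central", "approval required", r"open|approval required"
)
```

Matching the full string rather than searching for a substring avoids accidentally accepting a state that merely contains the word "open".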
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard