Bug 711252
Opened 14 years ago
Closed 14 years ago
setup monitoring of release-drivers email list
Categories: mozilla.org Graveyard :: Server Operations, task
Tracking: Not tracked
Status: RESOLVED WORKSFORME
People: Reporter: joduinn, Unassigned
Follow-on from bug 711242.
Today, during the FF9.0b6 release, emails to release-drivers were blocked and not being sent; we only noticed this by accident. Emails are now being sent again, but the concern is that this could happen again without warning.
What monitoring is in place (or should be put in place) to make sure IT, release coordinators and RelEng are notified if this mailing list stops working or gets backlogged?
Comment 1•14 years ago
Monitoring mail is not a relops issue, moving to the correct queue.
Assignee: server-ops-releng → server-ops
Component: Server Operations: RelEng → Server Operations
QA Contact: zandr → cshields
Comment 2•14 years ago
We don't have mailing-list-level monitoring in place (afaik), but it is possible to extend Nagios to do it. See http://exchange.nagios.org/directory/Plugins/Email-and-Groupware/Mailman/check_mailman_qfiles/details for an example.
Looping in Justdave for his feedback.
Comment 3•14 years ago
mburns: we're actually already using that. The thresholds are currently set to --warning=20 --critical=40 (minutes). This is monitoring the Mailman job queues.
That this didn't alert before the bug was opened suggests that things were operating as expected. It just happened that overall mailing-list traffic was *extremely* high during the time bug 711242 (the first part) occurred, but postfix and mailman were chugging through that as expected -- mail is an asynchronous technology, after all.
In fact, things were back to normal soon after the mail volume on the other list was reduced.
So, this is monitored to the extent necessary already.
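For illustration, the kind of queue-age check described above can be sketched as follows. This is not the actual check_mailman_qfiles plugin: it is a minimal sketch that assumes queue age is measured from the modification time of the oldest file in each queue directory, and the function names, the `qfiles_root` parameter, and the `in`/`out` queue selection are hypothetical. Only the 20/40-minute thresholds come from this bug.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def oldest_age_minutes(queue_dir, now=None):
    """Age in minutes of the oldest file in queue_dir (0.0 if empty or missing)."""
    now = time.time() if now is None else now
    try:
        entries = [e for e in os.scandir(queue_dir) if e.is_file()]
    except FileNotFoundError:
        return 0.0
    if not entries:
        return 0.0
    oldest_mtime = min(e.stat().st_mtime for e in entries)
    return max(0.0, (now - oldest_mtime) / 60.0)


def classify(age_minutes, warning=20, critical=40):
    """Map a queue age to a Nagios status using the thresholds from this bug."""
    if age_minutes >= critical:
        return CRITICAL
    if age_minutes >= warning:
        return WARNING
    return OK


def check_queues(qfiles_root, queues=("in", "out"), warning=20, critical=40, now=None):
    """Return (status, worst_age_minutes) across the given queue directories.

    qfiles_root and the queue names are assumptions for this sketch.
    """
    worst_age = max(
        (oldest_age_minutes(os.path.join(qfiles_root, q), now) for q in queues),
        default=0.0,
    )
    return classify(worst_age, warning, critical), worst_age
```

A check like this only alerts when a message has been sitting in a queue longer than the threshold, which is consistent with no alert firing while postfix and mailman were still working through a large but moving backlog.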
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME
Comment 4•14 years ago (Reporter)
In the RelEng/IT meeting this morning, dustin agreed to also send Nagios alerts for this to #buildduty. Per IRC, this is now done.
(Disclaimer: there are many other possible causes for mail delays to users on that mailing list, and this only monitors one of them. However, at least next time, if mailman is the problem, RelEng will know about it quickly, which matters during the tight timing of a release. Obviously, if a mailing list delay happens again in a different area that we are not monitoring, we'll revisit in a new, separate bug when that happens.)
Updated•10 years ago
Product: mozilla.org → mozilla.org Graveyard