Bug 1805138 Comment 0 Edit History


On 2022-12-12 TreeHerder, acting as the `Intermittent Failures Robot` Bugzilla account, generated enough bugmail to trigger infrastructure alerts.  For just over 52 minutes it commented on a bug every 2 seconds, touching 1528 bugs and generating about 30k emails.

This is a significant number of emails for Bugzilla's infrastructure to process, and it triggered an alert that required attention from our SRE team.  Although an alert was triggered, no infrastructure issues were noticed in this instance aside from delayed email delivery system-wide.  On Bugzilla's side we'll investigate whether our alerting can be changed to avoid alerting where possible (though there is value in raising awareness of slow email delivery).

I'd like for the TreeHerder team to consider what changes could be made to `intermittents_commenter` to reduce its impact on the service's health.

Possible solutions include:
- running the command more frequently, which would reduce the number of bugs updated at the same time
- adding long delays between bug updates, which might give Bugzilla enough time to handle the emails generated by one bug before it has to handle the next
- preventing the account from generating emails when it updates a bug; this is a BMO admin setting that we can put in place for you, and it prevents bugmail from being created for any change made by the bot
- considering the value of the comments and finding an alternative method for surfacing information about intermittents, such as a dashboard

On 2022-12-12 TreeHerder, acting as the `Intermittent Failures Robot` Bugzilla account, generated enough bugmail to trigger infrastructure alerts.  For just over 52 minutes it commented on a bug every 2 seconds, touching 1528 bugs and generating about 30k emails.

This is a significant number of emails for Bugzilla's infrastructure to process, and it triggered an alert that required attention from our SRE team.  Although an alert was triggered, no infrastructure issues were noticed in this instance aside from delayed email delivery system-wide.

On Bugzilla's side we'll investigate whether our alerting can be changed to avoid alerting where possible (though there is value in raising awareness of slow email delivery).  As TreeHerder's behaviour hasn't changed, we'll also investigate why we're seeing slower email handling following BMO's migration from AWS to GCP.

In any event I'd like for the TreeHerder team to consider what changes could be made to `intermittents_commenter` to reduce its impact on the service's health.

Possible solutions include:
- running the command more frequently, which would reduce the number of bugs updated at the same time
- adding long delays between bug updates, which might give Bugzilla enough time to handle the emails generated by one bug before it has to handle the next
- preventing the account from generating emails when it updates a bug; this is a BMO admin setting that we can put in place for you, and it prevents bugmail from being created for any change made by the bot
- considering the value of the comments and finding an alternative method for surfacing information about intermittents, such as a dashboard
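
To illustrate the second option, here is a minimal sketch of a throttled update loop. It is not taken from `intermittents_commenter`; the function name `update_bugs_throttled`, the `post_comment` callable, and the 30-second default delay are all hypothetical and would need to be adapted to however TreeHerder actually posts comments.

```python
import time
from typing import Callable, Iterable, List


def update_bugs_throttled(
    bug_ids: Iterable[int],
    post_comment: Callable[[int], None],  # hypothetical: posts one comment to one bug
    delay_seconds: float = 30.0,          # assumed value, not a BMO recommendation
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Post one comment per bug, pausing between consecutive updates."""
    updated: List[int] = []
    for index, bug_id in enumerate(bug_ids):
        # No pause before the first bug; pause before every later one so
        # Bugzilla has time to process the bugmail from the previous update.
        if index > 0:
            sleep(delay_seconds)
        post_comment(bug_id)
        updated.append(bug_id)
    return updated
```

With 1528 bugs and a 30-second delay, a single run would take roughly 12.7 hours, so a delay like this would probably need to be combined with running the command more frequently on smaller batches.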
