The current expiry alerts are triggered rather late, apparently 1 day before merge day (and resulting version bumps). This leaves no time for engineers to react to changes and results in regular tree failures.

Let's:
- send those alerts earlier (proposing 6 weeks / one release cycle)
- make sure they are also sent to all "alert_emails" entries for the probe

Relatedly, we still need to make sure to send alert emails for scalars (bug 1306623) and events (bug 1345452).
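To make the proposal concrete, here is a rough sketch of the selection logic: pick every probe that expires within one release cycle and collect all of its "alert_emails" entries. This is not the actual alerting code. The `Probe` class, the integer `expires_in_version` field, and the `probes_to_alert` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Probe:
    # Hypothetical, simplified probe record; real probe definitions differ.
    name: str
    expires_in_version: int  # assumed: first major version in which the probe is expired
    alert_emails: list = field(default_factory=list)


def probes_to_alert(probes, current_version, versions_ahead=1):
    """Map probe name -> recipients for probes expiring within `versions_ahead` releases."""
    due = {}
    for probe in probes:
        remaining = probe.expires_in_version - current_version
        # Alert only for probes that are not yet expired but will be soon.
        if 0 < remaining <= versions_ahead:
            # Send to every alert_emails entry, de-duplicated.
            due[probe.name] = sorted(set(probe.alert_emails))
    return due
```

With `versions_ahead=1`, owners would hear about an expiry roughly one release cycle before the version bump instead of the day before merge day.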
Note for triage: This is a real issue right now. Sheriffs and engineers have to jump in and fix things when this happens around merge day.
Component: Metrics: Pipeline → Monitoring & Alerting
Product: Cloud Services → Data Platform and Tools
We send these emails two weeks in advance. Long enough?
I'm just gonna say 2 weeks is long enough :)
Status: NEW → RESOLVED
Last Resolved: 11 months ago
Resolution: --- → WORKSFORME