Closed
Bug 748563
Opened 13 years ago
Closed 13 years ago
[briar-patch] Give concise, human-readable next steps for slaves needing recovery in the kitten emails
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: coop, Assigned: bear)
References
Details
(Whiteboard: [briarpatch][capacity][buildslaves][reporting])
I posted about Facebook's auto-remediation system back in the fall:
https://www.facebook.com/notes/facebook-engineering/making-facebook-self-healing/10150275248698920
As much as possible, I want briar-patch to be working towards this goal.
In almost all cases, we know what the next steps are that a human should take for a particular slave. At the very least, we know a first step, e.g. "Is this slave enabled in slavealloc?" Rather than display a list of previous states in the kitten emails, let's map those to a human action, and (where possible) provide a link for someone (buildduty) to get started performing that action.
For example:
I know that the kitten report is in some state of flux right now with the colo move, but two win64 slaves were consistently appearing in the "previously seen" category. By logging into those slaves, I was able to determine that auto-logon was not set up on them, so buildbot was never getting the chance to start. That kind of information should live in a state matrix somewhere so that the next time a win64 slave enters that state, we can tell buildduty what to do ("Please VNC into this host to make sure auto-logon is set up.").
In short, I want the report to show me actionable work, with a link to the dashboard that will allow me to drill down and get to the info that the report currently displays.
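As a rough illustration of the state matrix idea, here is a minimal Python sketch. It is not taken from briar-patch: the state names (`previously_seen`, `not_in_slavealloc`), the `NextStep` type, the `next_step` and `report_line` helpers, and the dashboard URL are all hypothetical placeholders. Only the two action strings come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NextStep:
    """A human action for buildduty, plus an optional link to get started."""
    action: str
    link_template: Optional[str] = None


# Hypothetical placeholder; the real dashboard URL is not given in this bug.
DASHBOARD_URL = "https://dashboard.example.invalid/slaves/{host}"

# Hypothetical state matrix keyed by (platform, state).
# A platform of "*" applies to any platform without a more specific entry.
STATE_MATRIX = {
    ("win64", "previously_seen"): NextStep(
        "Please VNC into this host to make sure auto-logon is set up.",
        DASHBOARD_URL,
    ),
    ("*", "not_in_slavealloc"): NextStep(
        "Is this slave enabled in slavealloc?",
        DASHBOARD_URL,
    ),
}

# Fallback when nothing in the matrix matches.
DEFAULT_STEP = NextStep(
    "No known next step; check the dashboard for this slave's history.",
    DASHBOARD_URL,
)


def next_step(platform: str, state: str) -> NextStep:
    """Look up the platform-specific entry first, then the wildcard entry."""
    return (
        STATE_MATRIX.get((platform, state))
        or STATE_MATRIX.get(("*", state))
        or DEFAULT_STEP
    )


def report_line(host: str, platform: str, state: str) -> str:
    """Format one actionable line for the kitten email."""
    step = next_step(platform, state)
    line = f"{host}: {step.action}"
    if step.link_template:
        line += " " + step.link_template.format(host=host)
    return line
```

With a table like this, handling a newly diagnosed failure mode means adding one entry rather than changing the report code, and the email can lead with the action while the dashboard link carries the state history.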
Comment 1 (Reporter) • 13 years ago
I've made a first stab at HTML mail with some bug links here:
https://github.com/ccooper/briar-patch/commit/ec362f277659a24e151dc239937a5bb26a3ea4eb
Comment 2 (Assignee) • 13 years ago
coop's changes have been merged and tested - doing a test run now on staging
https://github.com/mozilla/briar-patch/commit/ca1a41b981b72d7382283b1da6054be3155ad948
Updated (Reporter) • 13 years ago
Whiteboard: [briar-patch][capacity][buildslaves][reporting] → [briarpatch][capacity][buildslaves][reporting]
Updated (Reporter) • 13 years ago
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
Updated • 12 years ago
Product: mozilla.org → Release Engineering
Updated • 7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated • 6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard