Closed
Bug 1307169
Opened 8 years ago
Closed 7 years ago
implement emergency shut-off
Categories
(Release Engineering Graveyard :: Applications: Balrog (backend), defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bhearsum, Assigned: asilva)
References
Details
(Whiteboard: [lang=python])
Currently, Balrog has a long-standing rule, with no product or version specified, that points at No-Update. It normally sits at priority 0, but its priority can be raised to quickly shut off ALL updates.
It was always a bit of a hack, and we've since outgrown it. For example, using this existing rule shuts off GMP updates as well as Firefox ones, which can break Netflix for users who don't yet have Widevine installed. Additionally, the multiple signoff system that we're going to start working on soon will probably break the current hack.
I think we've reached the point where having a more formal emergency shut-off is a good idea. It should have product-level, and maybe channel-level, granularity. A shutoff should not require multiple signoff, but it should send a notification that it happened.
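To illustrate why the priority hack is too blunt, here is a minimal sketch, assuming a simplified model in which the highest-priority matching rule wins and an unset field matches anything. The `Rule` class, `pick_rule` helper, and mapping names are hypothetical and are not Balrog's actual rule-matching code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    # Hypothetical, simplified rule: real Balrog rules have many more fields.
    priority: int
    mapping: str
    product: Optional[str] = None  # None matches any product
    channel: Optional[str] = None  # None matches any channel


def matches(rule: Rule, product: str, channel: str) -> bool:
    return rule.product in (None, product) and rule.channel in (None, channel)


def pick_rule(rules: List[Rule], product: str, channel: str) -> Optional[Rule]:
    # Assumption: the matching rule with the highest priority wins.
    candidates = [r for r in rules if matches(r, product, channel)]
    return max(candidates, key=lambda r: r.priority, default=None)


rules = [
    Rule(priority=0, mapping="No-Update"),  # catch-all rule, normally dormant
    Rule(priority=90, mapping="Firefox-release-latest", product="Firefox", channel="release"),
    Rule(priority=90, mapping="GMP-latest", product="GMP"),
]

normal_firefox = pick_rule(rules, "Firefox", "release").mapping
normal_gmp = pick_rule(rules, "GMP", "release").mapping

# The "hack": raise the catch-all rule above everything else.
rules[0].priority = 1000
shutoff_firefox = pick_rule(rules, "Firefox", "release").mapping
shutoff_gmp = pick_rule(rules, "GMP", "release").mapping  # GMP is shut off too
```

Because the catch-all rule has no product, raising its priority cannot target Firefox alone, which is the granularity problem described above.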
Comment 1 • 8 years ago (Reporter)
Varun has been doing some thinking about this, and is planning to work on it.
Assignee: nobody → varunj.1011
Updated • 8 years ago (Reporter)
Priority: -- → P3
Whiteboard: [lang=python]
Comment 2 • 8 years ago (Reporter)
Varun, are you still planning to look at this?
Flags: needinfo?(varunj.1011)
Comment 3 • 8 years ago
Sorry for the delay! Yes, I will look at it soon.
Flags: needinfo?(varunj.1011)
Comment 4 • 7 years ago (Reporter)
This is becoming more urgent now that multiple signoffs are in place for all channels. I think we should aim to fix this before the end of the year. We should probably have a look at the current design and make sure it's compatible with multiple signoffs before proceeding any further.
Priority: P3 → P2
Updated • 7 years ago (Reporter)
Priority: P2 → P1
Comment 5 • 7 years ago (Reporter)
Varun and I talked about this a while back. The idea at that point was to have a separate table that tracked whether or not a given product+channel's updates were currently disabled. This table would be exempt from multiple signoffs, and we'd add some new UI to control it. I suspect we'd probably want a button on the Rules page that turns updates off or on for the currently selected product+channel.
One downside to this plan is that you wouldn't need multiple signoff to turn updates back on. I think that's probably OK for an initial implementation; we could always add it in later.
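The separate-table idea can be sketched roughly as follows. This is only an in-memory illustration under assumed names (`ShutoffTable`, `shut_off`, `restore`, `updates_enabled` are all hypothetical); the real implementation would be a database table with its own API and UI.

```python
from typing import Set, Tuple


class ShutoffTable:
    """Hypothetical in-memory stand-in for a per product+channel shutoff table."""

    def __init__(self) -> None:
        self._disabled: Set[Tuple[str, str]] = set()

    def shut_off(self, product: str, channel: str) -> None:
        # Record that updates for this product+channel are disabled.
        self._disabled.add((product, channel))

    def restore(self, product: str, channel: str) -> None:
        # Re-enable updates; a no-op if they were never disabled.
        self._disabled.discard((product, channel))

    def updates_enabled(self, product: str, channel: str) -> bool:
        return (product, channel) not in self._disabled


table = ShutoffTable()
table.shut_off("Firefox", "release")

firefox_release_enabled = table.updates_enabled("Firefox", "release")
firefox_beta_enabled = table.updates_enabled("Firefox", "beta")
gmp_release_enabled = table.updates_enabled("GMP", "release")

table.restore("Firefox", "release")
restored = table.updates_enabled("Firefox", "release")
```

Keying on the (product, channel) pair means a Firefox release shutoff leaves GMP and other Firefox channels untouched, unlike the catch-all rule.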
Comment 6 • 7 years ago (Reporter)
I spoke with Varun - he's not going to have time to look at this anytime soon. Thanks for your brainstorming on this, Varun!
Assignee: varunj.1011 → nobody
Comment 8 • 7 years ago (Reporter)
Catlee and I talked a little bit about this today. He reminded me that one of the crucial requirements here is that we need to be able to give folks who otherwise don't have access to Balrog the power to make an emergency shutoff. These folks shouldn't be able to turn updates back on, though - that should be left to folks who are more experienced and knowledgeable about Balrog and updates.
We also said that the UI for this should be very simple and fast to use. Because we're going to be doing these on a product+channel basis, perhaps it should be integrated into the Rules UI, and use the currently filtered product+channel?
Bonus points for shiny big red buttons and playing sounds like http://soundbible.com/1511-Fire-Truck-Siren.html as part of shutoff.
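The asymmetric permission requirement (many people may shut updates off, only experienced people may turn them back on) could look something like the sketch below. The role names, `PermissionDenied`, and both functions are hypothetical and only illustrate the idea; they are not Balrog's permission model.

```python
from typing import Set, Tuple


class PermissionDenied(Exception):
    """Raised when a role may not perform the requested action."""


# Hypothetical roles: a wider group may shut off, a narrower one may restore.
CAN_SHUT_OFF = {"releng-admin", "release-manager"}
CAN_RESTORE = {"releng-admin"}


def shut_off(disabled: Set[Tuple[str, str]], role: str, product: str, channel: str) -> None:
    if role not in CAN_SHUT_OFF:
        raise PermissionDenied(f"{role} may not shut off updates")
    disabled.add((product, channel))


def restore(disabled: Set[Tuple[str, str]], role: str, product: str, channel: str) -> None:
    if role not in CAN_RESTORE:
        raise PermissionDenied(f"{role} may not restore updates")
    disabled.discard((product, channel))


disabled: Set[Tuple[str, str]] = set()
shut_off(disabled, "release-manager", "Firefox", "release")

try:
    restore(disabled, "release-manager", "Firefox", "release")
    restore_blocked = False
except PermissionDenied:
    restore_blocked = True

still_disabled = ("Firefox", "release") in disabled

restore(disabled, "releng-admin", "Firefox", "release")
enabled_again = ("Firefox", "release") not in disabled
```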
Updated • 7 years ago (Assignee)
Assignee: nobody → allan.tavares
Comment 9 • 7 years ago
Commit pushed to master at https://github.com/mozilla/balrog
https://github.com/mozilla/balrog/commit/515239f5b049fcb3bbd56ac80d646637f7d11e45
bug 1307169: Emergency shut off backend API (#450). r=bhearsum
Comment 10 • 7 years ago (Reporter)
The backend of this was implemented in https://github.com/mozilla/balrog/pull/450. I'm leaving this bug open to continue to track the frontend work taking place in https://github.com/mozilla/balrog/pull/460
Comment 11 • 7 years ago
Commit pushed to master at https://github.com/mozilla/balrog
https://github.com/mozilla/balrog/commit/de243f74670c8e6bf8058cb72f482a3f856a294f
bug 1307169: Emergency shut off frontend (#460). r=bhearsum
Comment 12 • 7 years ago (Reporter)
The UI hit production today. Are we all done with the Balrog side of things, Allan?
Flags: needinfo?(allan.tavares)
Comment 13 • 7 years ago (Assignee)
Cool! Yeah, all done in Balrog. It is necessary to set the "emergency_shutoff/delete" permission for balrogagent in prod, if it is not already set.
We also need to give people who have no VPN access the ability to shut off updates. In response to an e-mail that I sent to the RelEng team, Catlee and Jlund like the idea of having a site in mozilla/services behind auth0 authentication.
Also, shutoff information is available in the public API (https://aus5.mozilla.org/api/v1/emergency_shutoff). Nthomas suggests showing shutoff status in the Balrog UI index and in the Delivery Dashboard (https://mozilla.github.io/delivery-dashboard).
Thanks again for mentoring, bhearsum!
Flags: needinfo?(allan.tavares)
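For anyone wanting to surface shutoff status elsewhere, a minimal sketch of reading the public endpoint mentioned in comment 13 might look like this. The function name and the injectable `opener` parameter are hypothetical; the shape of the JSON response is not described in this bug, so the sketch returns the decoded payload as-is rather than assuming any fields.

```python
import json
import urllib.request

# URL taken from comment 13; whether it is still served is not verified here.
SHUTOFF_URL = "https://aus5.mozilla.org/api/v1/emergency_shutoff"


def fetch_shutoff_status(url=SHUTOFF_URL, opener=urllib.request.urlopen, timeout=10):
    """Fetch and decode the emergency shutoff JSON document.

    `opener` is injectable so the function can be exercised without
    network access; it must accept (url, timeout=...) and return a
    file-like context manager, as urllib.request.urlopen does.
    """
    with opener(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_shutoff_status()` with no arguments would hit the production URL; a dashboard would then inspect the returned object according to whatever schema the API actually uses.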
Comment 14 • 7 years ago (Reporter)
(In reply to Allan [:asilva] from comment #13)
> Cool! Yeah, all done in Balrog.
Okay! I'm going to close this bug out then - we can track alternative UIs and clients in other bugs.
> It is necessary set
> "emergency_shutoff/delete" permission to balrogagent in prod, if it already
> not set.
Are you sure about this? As far as I can tell, the Agent only goes through the /scheduled_changes/emergency_shutoff endpoint, which eventually ends up checking for "scheduled_change" "enact" permission (https://github.com/mozilla/balrog/blob/master/auslib/db.py#L1269). I didn't add any other permissions in dev or staging, and it seemed to work OK there.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(allan.tavares)
Resolution: --- → FIXED
Comment 15 • 7 years ago (Assignee)
Ah, you're right, I forgot about this: https://github.com/mozilla/balrog/blob/master/auslib/db.py#L1309
Validation is using the "Scheduled By" field.
Flags: needinfo?(allan.tavares)
Updated • 5 years ago
Product: Release Engineering → Release Engineering Graveyard