Streamline release day throttling in Balrog for Firefox releases

RESOLVED FIXED

Status

Release Engineering
Balrog: Backend
RESOLVED FIXED
a year ago
5 months ago

People

(Reporter: bhearsum, Assigned: bhearsum)

Tracking

(Depends on: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(9 attachments)

(Assignee)

Description

a year ago
The key here is that we have a certain number of users we'd like to get onto a new version of Firefox (currently 20 million) after shipping, and then hold back updates for the remainder of users until we have feedback from the first batch. Currently this is done by going out at 25% for 24 hours, then throttling down to 0%. Various factors affect our ability to hit this target and we want to improve our accuracy and consistency.

Some options that have been discussed are:
* Adding support for scheduled changes to rules so that we could queue up the existing rate changes and have them happen automatically.
* Adding support for applying throttle rates based on a function. Eg: Rate(t) = 3e^(-((t-2)^2/6^2))
* Adding support to offer a maximum of N updates for a particular rule, and then automatically throttle to 0%
* Talking with Telemetry to find out how many successful update requests have happened.
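
As a rough illustration of the function-based option, here is a minimal Python sketch. It assumes the example formula is meant as a Gaussian-shaped curve, reading its variable as the time since release; the function name, the parameter names and the unit of t are assumptions made for the example, not anything Balrog implements.

```python
import math


def throttle_rate(t, peak=3.0, center=2.0, width=6.0):
    """Hypothetical rate curve: peak * e^(-((t - center)^2 / width^2))."""
    return peak * math.exp(-((t - center) ** 2) / width ** 2)


# Print the rate at a few points in time to see the curve rise and fall.
for t in (0, 2, 8, 24):
    print(t, round(throttle_rate(t), 3))
```

The curve peaks at t = center and decays on either side, so the rate falls off on its own without anyone having to change it by hand.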

The last option is most likely the most precise as it would have us counting successful updates rather than guessing based on a period of time or number of offers made.
bug 1240522 might be interesting to look at for Telemetry. It's intended for choosing partials on ship-it but might be reusable.
(Assignee)

Comment 2

a year ago
(In reply to Nick Thomas [:nthomas] from comment #1)
> bug 1240522 might be interesting to look at for Telemetry. It's intended for
> choosing partials on ship-it but might be reusable.

Good point, I'll have a look. That bug says it's going to do real time analysis, which is very promising.
(Assignee)

Updated

a year ago
Depends on: 1240522
(Assignee)

Comment 3

a year ago
Nick and I had a big long chat about this today. Lots of great questions and ideas came up which I'm going to attempt to summarize.

I came up with this design over the past couple of days, which we used as a starting point: http://people.mozilla.org/~bhearsum/sattap/9d914790.png. It describes a system that allows for the scheduling of arbitrary changes (I've been using the term "triggers") to Rules based on different conditions (such as uptake or timestamps). My initial proposal suggested that a user would input the entire rule they want to have, as well as the conditions on when the change should happen. A new component that I've been calling the "Balrog Agent" would consume these, monitor for the conditions (eg: watch Telemetry or the clock), and talk to the Balrog API to enact the changes at the appropriate time. A key part of this is that the Agent would only reference an existing scheduled change by id, which means it wouldn't be able to make arbitrary changes to Rules, merely implement changes that a user submitted (which have already gone through permissions checks for product, etc).
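
A minimal sketch of what one polling pass of such an Agent might look like, in Python. Everything here is hypothetical: the `ScheduledChange` fields, `is_ready`, `run_once` and the `enact` callback are names invented for the example, and in a real deployment the server, not the Agent, would record completion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScheduledChange:
    sc_id: int
    when: Optional[int] = None        # fire at or after this timestamp
    min_uptake: Optional[int] = None  # fire once this many users have updated
    complete: bool = False


def is_ready(sc, now, uptake):
    """A change is ready when its time or uptake condition has been met."""
    if sc.when is not None and now >= sc.when:
        return True
    if sc.min_uptake is not None and uptake >= sc.min_uptake:
        return True
    return False


def run_once(changes, now, uptake, enact):
    """One polling pass: enact every pending change whose condition is met."""
    enacted = []
    for sc in changes:
        if not sc.complete and is_ready(sc, now, uptake):
            enact(sc.sc_id)  # only the id is passed, never rule data
            sc.complete = True
            enacted.append(sc.sc_id)
    return enacted


calls = []
pending = [ScheduledChange(1, when=100), ScheduledChange(2, min_uptake=20_000_000)]
print(run_once(pending, now=150, uptake=5_000_000, enact=calls.append))
```

Because `enact` receives nothing but an id, the Agent cannot invent a change of its own; it can only ask the server to apply something a user already submitted.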

Things that came up were:
* What kind of credentials do we use for the Agent? Can we derive temporary credentials and somehow associate them with the user making the change?
** Might not be possible while we're still using LDAP auth, because Balrog can't create new users that can login. Might be able to add secondary auth methods after we move to CloudOps and control more of the stack.

* Some concern over changes being too magical or surprising.
** The Agent should send mail or otherwise notify when making changes to minimize surprise.

* What happens if we never hit the conditions of a trigger? How long is too long to wait?
** Eg: it takes 2 weeks to hit 20M users on release. We've probably done something else in the meantime, it might be confusing for the trigger to do something later.
** Non-time based triggers should probably have an expiration time on them to minimize the chance of unexpected things happening.

* "Triggers" is probably a bad name (though I'll continue to use it in this comment), because it is unlikely to be implemented as database triggers, and could cause confusion for unfamiliar folks (especially DBAs).

* Permissions should probably be verified before enacting a change, just in case the user who created the trigger had their permissions revoked between creation and firing time.
** Something that didn't come up while we talked, but I realized while writing up these notes, is that this second permissions check might be quite difficult with the current model of ACLs that are based on API endpoints. Because the Agent is not going to be using the existing /rules/:id (et al.) endpoints we might not know what ACLs are needed. Maybe we can require admin level access for a product to create triggers?

* Lots of talk about what to do when rules change between trigger creation and firing time.
** Should we completely overwrite the rule with the values from the trigger, or throw an error?
** Should the trigger only store the actual parts of the rule that they want to change, and only throw an error if that conflicts?
** Strong emphasis that we need good UI around this, probably at least:
*** Warn users when creating a trigger for a rule that already has a pending trigger (if multiple triggers per rule are allowed at all)
*** Warn users when updating a rule that has a pending trigger (and possibly allow them to cancel the trigger?)
** We need to make some decisions about what we will and won't support, which should help guide us here.

* Do we allow multiple conditions on a trigger? If so, how do we store those and represent AND/OR in the db? Store as JSON? DSL in a TEXT column?

* Should look to see if there are other projects that have implemented something similar.
** Maybe a db-backed cron system?
** Event schedulers?
** Socorro may have some sort of "cron on steroids"?

* How do triggers integrate with "rule change simulation", which we've talked about maybe being implemented as "rule sets"? Do rule-level triggers make sense in that world? Is there a different way to implement rule change simulation?

* Can we run multiple copies of the agent for redundancy? Are there concurrency issues to worry about?

* Probably should keep expired triggers around.
** Maybe use a complete_at column or something similar to distinguish between active and not.
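
On the question above about multiple conditions and AND/OR, one possible shape for the "store as JSON" idea is a small condition tree. The sketch below is only an illustration: the `and`/`or`/`field`/`gte` keys and the `evaluate` helper are made up for the example and are not an existing Balrog format.

```python
import json


def evaluate(cond, facts):
    """Recursively evaluate a condition tree against observed facts."""
    if "and" in cond:
        return all(evaluate(c, facts) for c in cond["and"])
    if "or" in cond:
        return any(evaluate(c, facts) for c in cond["or"])
    # Leaf node: {"field": <name>, "gte": <threshold>}
    return facts[cond["field"]] >= cond["gte"]


# What might be stored in a TEXT column: fire at 20M uptake, OR once it is
# past timestamp 1000 AND at least 1M users have updated.
stored = json.dumps({"or": [
    {"field": "uptake", "gte": 20000000},
    {"and": [
        {"field": "now", "gte": 1000},
        {"field": "uptake", "gte": 1000000},
    ]},
]})

condition = json.loads(stored)
print(evaluate(condition, {"now": 1200, "uptake": 5000000}))
```

A nested structure like this avoids inventing a DSL, at the cost of being awkward to query from SQL.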


The next step here is for me to try to come up with a more concrete design, and then I'd like to try to pick holes in that with a bit larger of a group.
(Assignee)

Comment 4

a year ago
I'm going to be replacing the term "trigger" with "scheduled change" from now on (though that's not necessarily the final term). With that in mind....

(In reply to Ben Hearsum (:bhearsum) from comment #3)
> * Lots of talk about what to do when rules change between trigger creation
> and firing time.
> ** Should we completely overwrite the rule with the values from the trigger,
> or throw an error?
> ** Should the trigger only store the actual parts of the rule that they want
> to change, and only throw an error if that conflicts?
> ** Strong emphasis that we need good UI around this, probably at least:
> *** Warn users when creating a trigger for a rule that already has a pending
> trigger (if multiple triggers per rule are allowed at all)
> *** Warn users when updating a rule that has a pending trigger (and possibly
> allow them to cancel the trigger?)

I did some more thinking about this today, and I think I've come up with a solution that strikes a good balance between UX and safety.

First off, I think it's best that scheduled changes always store the entire new value of the rule they will be updating. Storing only partial rules means it's difficult to distinguish between a scheduled change that wants to change a column to NULL, and one that doesn't want to change the column at all. The only way I could think of to deal with that is to have some sentinel string value that means "don't touch me". I think this is both ugly and unnecessarily restrictive (you would have to reject values for that column that match the sentinel value).
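
The ambiguity is easy to see in a small Python sketch. The rule is modelled as a plain dict, and the `comment` column and both helper functions are hypothetical names chosen for the example.

```python
current = {"rule_id": 2, "mapping": "Firefox-43.0.3", "throttle": 0, "comment": "hold"}

# Partial change: does None mean "set the column to NULL" or "leave it alone"?
partial = {"throttle": 100, "comment": None}


def apply_partial(rule, change):
    # None is treated as "don't touch", so NULL can never be scheduled.
    return {k: (change[k] if change.get(k) is not None else v) for k, v in rule.items()}


def apply_full(rule, change):
    # A full replacement is unambiguous: None always means NULL.
    return dict(change)


full = {"rule_id": 2, "mapping": "Firefox-43.0.3", "throttle": 100, "comment": None}

print(apply_partial(current, partial)["comment"])  # the old value survives
print(apply_full(current, full)["comment"])        # the column is cleared
```

With the partial form there is no way to express "clear this column" without reserving some sentinel value, which is the restriction described above.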

The downside to always storing the entire new rule is that it's easier for it to get out of date, but I think this can be mitigated by updating scheduled changes (after user sign off) if the rule they're associated with changes between scheduled change creation and execution. For example, if we start with a rule like:
rule_id: 2
Mapping: Firefox-43.0.3
Throttle: 0

And we schedule a change that would change it like this:
rule_id: 2
Mapping: Firefox-43.0.3
Throttle: 100

And then a user changes it like this:
rule_id: 2
Mapping: Firefox-44.0
Throttle: 0

...we could offer to update the scheduled change to:
rule_id: 2
Mapping: Firefox-44.0
Throttle: 100

There's also the case of a conflict to deal with, eg: the Throttle is different between the old, new, and scheduled change. There's no way to really guess smartly about what the value should be updated to in the scheduled change, so in this case we'd force the user to enter a new value entirely. This is very much like applying a diff and hitting a conflict.
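
The update-or-conflict behaviour described above amounts to a three-way merge per column. A minimal Python sketch, assuming rules are plain dicts with the same keys; `merge_scheduled_change` is a name invented for the example and is not the actual Balrog implementation.

```python
def merge_scheduled_change(old, new, scheduled):
    """Three-way merge of a scheduled change onto an updated rule.

    Returns (merged, conflicts). Conflicting columns are left out of
    `merged` so that a user has to supply their value explicitly.
    """
    merged, conflicts = {}, []
    for col in old:
        o, n, s = old[col], new[col], scheduled[col]
        if s == o:
            # The scheduled change didn't touch this column: follow the rule.
            merged[col] = n
        elif n == o or n == s:
            # The rule didn't change it (or changed it to the same value).
            merged[col] = s
        else:
            conflicts.append(col)
    return merged, conflicts


old = {"rule_id": 2, "mapping": "Firefox-43.0.3", "throttle": 0}
scheduled = {"rule_id": 2, "mapping": "Firefox-43.0.3", "throttle": 100}
new = {"rule_id": 2, "mapping": "Firefox-44.0", "throttle": 0}

print(merge_scheduled_change(old, new, scheduled))

# If the rule's throttle was also changed, there is nothing sensible to guess.
print(merge_scheduled_change(old, dict(new, throttle=50), scheduled))
```

The first call reproduces the example from this comment (mapping follows the rule, throttle follows the scheduled change); the second reports `throttle` as a conflict.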

I thought about the UI flow a little bit too, which ends up being relatively straightforward:
* When attempting to add a new scheduled change:
** If no other changes are scheduled for this rule, just save it
** If other changes are scheduled for this rule, show a page with their details, and let the user decide if they still want the new one. Maybe also allow for editing the existing scheduled changes at the same time?

* When attempting to edit a rule
** If no changes are scheduled for the rule, just save it
** If changes are scheduled for the rule, show the same warning page as above.


And tangentially related:
> * Some concern over changes being too magical or surprising.
> ** The Agent should send mail or otherwise notify when making changes to minimize surprise.

Rail had a good idea about this today, and suggested that we mail in advance of making changes. Eg: for time based triggers mail N minutes/hours before making the change, as well as right before making it. For condition based triggers we could do similar and eg: send warning mail when we're at 50% of the target uptake. This may give us a chance to rectify an errant trigger before anything bad happens.
(Assignee)

Comment 5

a year ago
Nick and I spoke to kang and claudijd about this today, and they said that the security model of an agent that is only allowed to enact prescheduled changes (which have been previously authorized at submission time) is fine. A few random things that came up in and after that meeting:
* We should check permissions on the server just before enacting a change, to guard against enacting a change made by a user who no longer has permission to do so (Nick suggested this previously).
* Do we need the ability to schedule inserts to the rules table, or just modifications? We can probably hackily do "inserts" by precreating rules with priority 0, but if there's a good use case for them it may be worth building in.
* We should keep some sort of history of changes to scheduled changes. I think we can probably just enable the existing History mechanism to do this. Might want to be careful what we call the columns in the scheduled change table to avoid conflicting with History columns.
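
The first point, re-checking permissions on the server just before enacting, could look roughly like the Python sketch below. The `scheduled_by` and `new_rule` keys, the `has_permission` callback and the in-memory `rules` dict are all assumptions made for the example.

```python
class PermissionDenied(Exception):
    pass


def enact(sc, rules, has_permission):
    """Apply a scheduled change only if its creator is still authorised."""
    new_rule = sc["new_rule"]
    if not has_permission(sc["scheduled_by"], new_rule["product"]):
        raise PermissionDenied(sc["scheduled_by"])
    rules[new_rule["rule_id"]] = dict(new_rule)


grants = {("alice", "Firefox")}


def has_permission(user, product):
    return (user, product) in grants


rules = {2: {"rule_id": 2, "product": "Firefox", "throttle": 0}}
sc = {
    "scheduled_by": "alice",
    "new_rule": {"rule_id": 2, "product": "Firefox", "throttle": 100},
}

enact(sc, rules, has_permission)
print(rules[2]["throttle"])

# Once the permission is revoked, the same scheduled change is refused.
grants.clear()
try:
    enact(sc, rules, has_permission)
except PermissionDenied as exc:
    print("refused for", exc)
```

The check runs against the user who scheduled the change rather than the Agent, so revoking that user's access also disarms anything they queued up.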

I intend to roll this all up and publish a blog post with the plan very soon now, as it's fairly solidified.
(Assignee)

Comment 6

a year ago
(In reply to Ben Hearsum (:bhearsum) from comment #4)
> I'm going to be replacing the term "trigger" with "scheduled change" from
> now on (though that's not necessarily the final term). With that in mind....
> 
> (In reply to Ben Hearsum (:bhearsum) from comment #3)
> > * Lots of talk about what to do when rules change between trigger creation
> > and firing time.
> > ** Should we completely overwrite the rule with the values from the trigger,
> > or throw an error?
> > ** Should the trigger only store the actual parts of the rule that they want
> > to change, and only throw an error if that conflicts?
> > ** Strong emphasis that we need good UI around this, probably at least:
> > *** Warn users when creating a trigger for a rule that already has a pending
> > trigger (if multiple triggers per rule are allowed at all)
> > *** Warn users when updating a rule that has a pending trigger (and possibly
> > allow them to cancel the trigger?)
> 
> I did some more thinking about this today, and I think I've come up with a
> solution that strikes a good balance between UX and safety.
> 
> First off, I think it's best that scheduled changes always store the entire
> new value of the rule they will be updating. Storing only partial rules
> means it's difficult to distinguish between a scheduled change that wants to
> change a column to NULL, and one that doesn't want to change the column at
> all. The only way I could think of to deal with that is to have some
> sentinel string value that means "don't touch me". I think this is both ugly
> and unnecessarily restrictive (you would have to reject values for that
> column that match the sentinel value).
> 
> The downside to always storing the entire new rule is that it's easier for
> it to get out of date, but I think this can be mitigated by updating
> scheduled changes (after user sign off) if the rule they're associated with
> changes between scheduled change creation and execution. For example, if we
> start with a rule like:
> rule_id: 2
> Mapping: Firefox-43.0.3
> Throttle: 0
> 
> And we schedule a change that would change it like this:
> rule_id: 2
> Mapping: Firefox-43.0.3
> Throttle: 100
> 
> And then a user changes it like this:
> rule_id: 2
> Mapping: Firefox-44.0
> Throttle: 0
> 
> ...we could offer to update the scheduled change to:
> rule_id: 2
> Mapping: Firefox-44.0
> Throttle: 100
> 
> There's also the case of a conflict to deal with, eg: the Throttle is
> different between the old, new, and scheduled change. There's no way to
> really guess smartly about what the value should be updated to in the
> scheduled change, so in this case we'd force the user to enter a new value
> entirely. This is very much like applying a diff and hitting a conflict.

This gets tricky when it comes to updates done through the API (eg: like what release automation does). There's no user, so no opportunity to show warnings or ask for confirmation. I think we're going to have to make a choice here between rejecting changes that would conflict with a scheduled change, or somehow marking those scheduled changes as "needing update" (and probably sending an alert about it).
(Assignee)

Comment 7

a year ago
(In reply to Ben Hearsum (:bhearsum) from comment #6)
> (In reply to Ben Hearsum (:bhearsum) from comment #4)
> > I'm going to be replacing the term "trigger" with "scheduled change" from
> > now on (though that's not necessarily the final term). With that in mind....
> > 
> > (In reply to Ben Hearsum (:bhearsum) from comment #3)
> > > * Lots of talk about what to do when rules change between trigger creation
> > > and firing time.
> > > ** Should we completely overwrite the rule with the values from the trigger,
> > > or throw an error?
> > > ** Should the trigger only store the actual parts of the rule that they want
> > > to change, and only throw an error if that conflicts?
> > > ** Strong emphasis that we need good UI around this, probably at least:
> > > *** Warn users when creating a trigger for a rule that already has a pending
> > > trigger (if multiple triggers per rule are allowed at all)
> > > *** Warn users when updating a rule that has a pending trigger (and possibly
> > > allow them to cancel the trigger?)
> > 
> > I did some more thinking about this today, and I think I've come up with a
> > solution that strikes a good balance between UX and safety.
> > 
> > First off, I think it's best that scheduled changes always store the entire
> > new value of the rule they will be updating. Storing only partial rules
> > means it's difficult to distinguish between a scheduled change that wants to
> > change a column to NULL, and one that doesn't want to change the column at
> > all. The only way I could think of to deal with that is to have some
> > sentinel string value that means "don't touch me". I think this is both ugly
> > and unnecessarily restrictive (you would have to reject values for that
> > column that match the sentinel value).
> > 
> > The downside to always storing the entire new rule is that it's easier for
> > it to get out of date, but I think this can be mitigated by updating
> > scheduled changes (after user sign off) if the rule they're associated with
> > changes between scheduled change creation and execution. For example, if we
> > start with a rule like:
> > rule_id: 2
> > Mapping: Firefox-43.0.3
> > Throttle: 0
> > 
> > And we schedule a change that would change it like this:
> > rule_id: 2
> > Mapping: Firefox-43.0.3
> > Throttle: 100
> > 
> > And then a user changes it like this:
> > rule_id: 2
> > Mapping: Firefox-44.0
> > Throttle: 0
> > 
> > ...we could offer to update the scheduled change to:
> > rule_id: 2
> > Mapping: Firefox-44.0
> > Throttle: 100
> > 
> > There's also the case of a conflict to deal with, eg: the Throttle is
> > different between the old, new, and scheduled change. There's no way to
> > really guess smartly about what the value should be updated to in the
> > scheduled change, so in this case we'd force the user to enter a new value
> > entirely. This is very much like applying a diff and hitting a conflict.
> 
> This gets tricky when it comes to updates done through the API (eg: like
> what release automation does). There's no user, so no opportunity to show
> warnings or ask for confirmation. I think we're going to have to make a
> choice here between rejecting changes that would conflict with a scheduled
> change, or somehow marking those scheduled changes as "needing update" (and
> probably sending an alert about it).

I think we can probably still implement the UI described if the API rejects such changes, it just means putting a bit more smarts on the frontend...
(Assignee)

Comment 8

a year ago
I'm not as far along as I'd hoped because the Balrog -> CloudOps migration has taken up a lot more time than I thought it would. This probably won't be ready this quarter, but I've been hacking away at as much as I can. At this point, I'm probably about 50% done with the backend portion. After that, there's still the UI and Balrog Agent.

My WIP on the backend is at https://github.com/mozilla/balrog/compare/master...bhearsum:rule-changes?expand=1
(Assignee)

Comment 9

a year ago
Created attachment 8732974 [details]
WIP patch for backend

Nick, I'd love some early feedback on this, particularly the noted points in the PR.
Attachment #8732974 - Flags: feedback?(nthomas)
Similar to bug 1113689, it would be cool if there was some UI indication that a scheduled change is set up for a rule.
(Assignee)

Comment 11

a year ago
(In reply to Nick Thomas [:nthomas] from comment #10)
> Similar to bug 1113689, it would be cool if there was some UI indication
> that a scheduled change is set up for a rule.

That's a really great idea, I'll make sure I do it as part of the UI for this.
Sorry for the delay here, I'll try to take another swing at this with a fresh brain tomorrow.

Comment 13

11 months ago
Comment on attachment 8732974 [details]
WIP patch for backend

See PR for comments.
Attachment #8732974 - Flags: feedback?(nthomas) → feedback+
(Assignee)

Comment 14

11 months ago
Created attachment 8749799 [details] [review]
Move permission enforcement to database layer

Details in the PR.
Attachment #8749799 - Flags: review?(nthomas)

Updated

10 months ago
Attachment #8749799 - Flags: review?(nthomas) → review+
(Assignee)

Comment 15

10 months ago
I've been working on this pretty actively the past couple of weeks. At this point the work is divided into 5 parts:
1) Rework database permissions, to enable us to do proper permission checks at submission and enacting time of scheduled changes. This part is reviewed and ready to land, and I intend to push it shortly after 48.0b1 ships (possibly the week of June 13th). Code is at https://github.com/mozilla/balrog/pull/80

2) Implement the core scheduled changes logic - mainly the table that contains the models and logic. This part is complete, unless new features come up during development of other parts. Code is at https://github.com/mozilla/balrog/pull/63

3) Implement a web API to manage scheduled changes. This part is more or less complete as well - I suspect it's missing a piece or two that I haven't realized yet, though. Code is at https://github.com/bhearsum/balrog/tree/rule-changes-web

4) UI to manage scheduled changes. I'm actively working on this part, it still needs a bit more feature work as well as polish. Code is at https://github.com/mozilla/balrog-ui/compare/master...bhearsum:rule-changes-ui?expand=1

5) Balrog Agent - this is the daemon process that will enact scheduled changes when they're ready. I did some initial exploratory work on this awhile back, but haven't implemented most of the guts yet. Code is at https://github.com/mozilla/balrog/compare/master...bhearsum:balrog-agent?expand=1

I intend to land each of these pieces as they are ready to minimize risk and excess branch management.
(Assignee)

Comment 16

9 months ago
Created attachment 8769837 [details] [review]
core implementation of scheduled changes

As usual, details in the PR.
Attachment #8769837 - Flags: review?(nthomas)

Updated

9 months ago
Attachment #8769837 - Flags: review?(nthomas) → review+
(Assignee)

Updated

9 months ago
Depends on: 1287494
(Assignee)

Updated

8 months ago
Depends on: 1288814

Comment 17

8 months ago
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/998748294dbe4480dde9cc8a9fc6a16b27f0e52e
bug 1246675: move permissions enforcement to database layer (#80). r=nthomas
(Assignee)

Comment 18

8 months ago
Created attachment 8775638 [details] [review]
add api endpoints for managing scheduled rule changes
Attachment #8775638 - Flags: review?(nthomas)
(Assignee)

Comment 19

8 months ago
Created attachment 8775639 [details]
Balrog agent implementation
Attachment #8775639 - Flags: feedback?(rail)
(Assignee)

Comment 20

8 months ago
Things are now in a state where anyone should be able to run the Balrog agent locally if you use the "balrog-agent" branch of my balrog repo, and the "rule-changes-ui" branch of my UI. There's still a few things to fix up, but you should be able to add time based scheduled changes through the UI and see them enacted through the Agent.
Comment on attachment 8775638 [details] [review]
add api endpoints for managing scheduled rule changes

A few nits on the PR but r+
Attachment #8775638 - Flags: review?(nthomas) → review+
Comment on attachment 8775639 [details]
Balrog agent implementation

see my comments in the PR
Attachment #8775639 - Flags: feedback?(rail) → feedback+

Updated

8 months ago
Blocks: 1289822
(Assignee)

Updated

8 months ago
Depends on: 1295183

Comment 23

8 months ago
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/cffc810c6f8f5b205fd956602a89dacf97036ded
bug 1246675: implement core functionality needed for Scheduled Changes. r=nthomas

Comment 24

8 months ago
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/2240b5caa9168bd197b86294fc7852f978de4b20
bug 1246675: add API endpoints to manage scheduled rule changes. r=nthomas
(Assignee)

Comment 25

8 months ago
(In reply to Ben Hearsum (:bhearsum) from comment #15)
> I've been working on this pretty actively the past couple of weeks. At this
> point the work is divided into 5 parts:
> 1) Rework database permissions, to enable us to do proper permission checks
> at submission and enacting time of scheduled changes. This part is reviewed
> and ready to land, and I intend to push it shortly after 48.0b1 ships
> (possibly the week of June 13th). Code is at
> https://github.com/mozilla/balrog/pull/80

This part is now in production.

> 2) Implement the core scheduled changes logic - mainly the table that
> contains the models and logic. This part is complete, unless new features
> come up during development of other parts. Code is at
> https://github.com/mozilla/balrog/pull/63
>
> 3) Implement a web API to manage scheduled changes. This part is more or
> less complete as well - I suspect it's missing a piece or two that I haven't
> realized yet, though. Code is at
> https://github.com/bhearsum/balrog/tree/rule-changes-web

These are on master, and expected to be in production on Wednesday.

> 4) UI to manage scheduled changes. I'm actively working on this part, it
> still needs a bit more feature work as well as polish. Code is at
> https://github.com/mozilla/balrog-ui/compare/master...bhearsum:rule-changes-
> ui?expand=1

This has significant work done, but there's still a bit more to do. Also needs review.

> 5) Balrog Agent - this is the daemon process that will enact scheduled
> changes when they're ready. I did some initial exploratory work on this
> awhile back, but haven't implemented most of the guts yet. Code is at
> https://github.com/mozilla/balrog/compare/master...bhearsum:balrog-
> agent?expand=1

This is mostly done, and has gone through an initial review pass.
(Assignee)

Updated

8 months ago
Depends on: 1295678
(Assignee)

Comment 26

7 months ago
Created attachment 8783060 [details] [review]
UI for scheduled rule changes

Nick, I know frontend is not exactly your area of expertise, but I'm curious what you think at least from a usability standpoint. I can find someone else to do the actual code review if you'd prefer.
Attachment #8783060 - Flags: review?(nthomas)
Comment on attachment 8783060 [details] [review]
UI for scheduled rule changes

f+ from a UI behaviour POV.
Attachment #8783060 - Flags: review?(nthomas) → feedback+
(Assignee)

Updated

7 months ago
See Also: → bug 1297765
(Assignee)

Updated

7 months ago
Attachment #8775639 - Flags: review?(rail)
Comment on attachment 8775639 [details]
Balrog agent implementation

r+ with some nits in the PR.
Attachment #8775639 - Flags: review?(rail) → review+
(Assignee)

Comment 29

7 months ago
Comment on attachment 8783060 [details] [review]
UI for scheduled rule changes

Should be ready for final review now.
Attachment #8783060 - Flags: review?(nthomas)

Comment 30

7 months ago
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/4a6d3e389a0a5eb3b828da7fb8f1bafa6329f492
bug 1246675: fully functional balrog agent (#103). r=rail,aki
(Assignee)

Comment 31

7 months ago
(In reply to Ben Hearsum (:bhearsum) from comment #25)
> (In reply to Ben Hearsum (:bhearsum) from comment #15)
> > 5) Balrog Agent - this is the daemon process that will enact scheduled
> > changes when they're ready. I did some initial exploratory work on this
> > awhile back, but haven't implemented most of the guts yet. Code is at
> > https://github.com/mozilla/balrog/compare/master...bhearsum:balrog-
> > agent?expand=1
> 
> This is mostly done, and has gone through an initial review pass.

I've just merged this to master, and I'll be filing a bug for a stage deployment of it this week.
(Assignee)

Updated

7 months ago
Depends on: 1300100
(Assignee)

Updated

7 months ago
Depends on: 1300101
(Assignee)

Comment 32

7 months ago
Created attachment 8787687 [details] [review]
fix regression with admin permissions
Attachment #8787687 - Flags: review?(rail)
Attachment #8787687 - Flags: review?(rail) → review+
(Assignee)

Comment 33

7 months ago
We tried out the Agent in stage today, and discovered that because of the way we're deploying (on the same host as the admin app, without ldap auth between the two), we need to set REMOTE_USER directly instead of just the Authorization header. Because nginx isn't sitting between the Agent and the admin app, Authorization never gets checked and translated into REMOTE_USER.
(Assignee)

Comment 34

7 months ago
Created attachment 8789482 [details] [review]
don't allow changes to be scheduled in the past
Attachment #8789482 - Flags: review?(nthomas)

Updated

7 months ago
Attachment #8783060 - Flags: review?(nthomas) → review+

Updated

7 months ago
Attachment #8789482 - Flags: review?(nthomas) → review+

Comment 35

7 months ago
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/3bb36b43b5544680d1192eaebc44f472e703d030
bug 1246675: Don't allow changes in the past to be scheduled (#119). r=nthomas
(Assignee)

Comment 36

7 months ago
Every part identified in comment #15 has been landed. We're currently working through Balrog Agent deployment in bug 1300100, and I expect that, and the final backend patches, to be in production by early next week.

There's still one follow-up issue I need to address here around multiple scheduled changes for the same rule. Currently, there's nothing that prevents you from scheduling them, but when the Agent attempts to enact the first one, it will fail to merge with the second one. I'm still figuring out the best path forward here - we must do something though, even if it's just to disallow multiple scheduled changes for one rule.
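
For illustration, the simplest option mentioned here (disallowing a second scheduled change for the same rule), combined with the earlier "not in the past" check, might be sketched in Python as follows. The `schedule` function, the `pending` dict and the error class are hypothetical and only stand in for whatever the scheduled changes table actually enforces.

```python
class ScheduleError(ValueError):
    pass


def schedule(pending, rule_id, when, now):
    """Record a scheduled change, rejecting past times and duplicates."""
    if when <= now:
        raise ScheduleError("cannot schedule a change in the past")
    if rule_id in pending:
        raise ScheduleError("rule %d already has a scheduled change" % rule_id)
    pending[rule_id] = when


pending = {}
schedule(pending, rule_id=2, when=2000, now=1000)

# A second change for the same rule, or one in the past, is refused.
for rule_id, when in ((2, 3000), (3, 500)):
    try:
        schedule(pending, rule_id=rule_id, when=when, now=1000)
    except ScheduleError as exc:
        print(exc)
```

Rejecting at submission time means the merge problem can never arise at enactment time, at the cost of users having to edit or delete the existing scheduled change first.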
(Assignee)

Updated

7 months ago
Depends on: 1302450
(Assignee)

Comment 37

7 months ago
Created attachment 8790831 [details] [review]
don't allow multiple scheduled changes for the same PK
Attachment #8790831 - Flags: review?(nthomas)

Updated

7 months ago
Attachment #8790831 - Flags: review?(nthomas) → review+

Comment 38

7 months ago
Commit pushed to master at https://github.com/mozilla/balrog

https://github.com/mozilla/balrog/commit/269dc9da4f4de35fc7625b8fddf5e0a4092a2968
bug 1246675: don't allow multiple scheduled changes for a single PK (#123). r=nthomas
(Assignee)

Comment 39

7 months ago
(In reply to [github robot] from comment #38)
> Commit pushed to master at https://github.com/mozilla/balrog
> 
> https://github.com/mozilla/balrog/commit/
> 269dc9da4f4de35fc7625b8fddf5e0a4092a2968
> bug 1246675: don't allow multiple scheduled changes for a single PK (#123).
> r=nthomas

Everything except this is now in production, and verified to be working. We don't yet have support for Telemetry based changes, which is blocked on bug 1240522. I'm going to poke about that, but I'll likely end up filing a follow-up for this.
(Assignee)

Comment 40

7 months ago
I updated the docs with some details on the permissions changes, scheduled changes, and the Agent: https://wiki.mozilla.org/index.php?title=Balrog&diff=1147926&oldid=1147500
Do we have a bug on file for the mail notifications of scheduled changes being applied ?
(Assignee)

Comment 42

6 months ago
(In reply to Nick Thomas [:nthomas] from comment #41)
> Do we have a bug on file for the mail notifications of scheduled changes
> being applied ?

I was about to file this, but I realized that we're *supposed* to be sending mail about all changes to the Rules table already (per https://bugzilla.mozilla.org/show_bug.cgi?id=1251338). We've got all the code in there to support it, we just need to set it up per https://github.com/mozilla/balrog/blob/master/uwsgi/admin.wsgi#L45. I'll get a bug on file to enable this once I've figured out where to send it.

Since this covers the rules table, this will mail whenever the change is applied. Could be that we should enable this for the scheduled changes table as well, though.
(Assignee)

Comment 43

6 months ago
(In reply to Ben Hearsum (:bhearsum) from comment #42)
> (In reply to Nick Thomas [:nthomas] from comment #41)
> > Do we have a bug on file for the mail notifications of scheduled changes
> > being applied ?
> 
> I was about to file this, but I realized that we're *supposed* to be sending
> mail about all changes to the Rules table already (per
> https://bugzilla.mozilla.org/show_bug.cgi?id=1251338). We've got all the
> code in there to support it, we just need to set it up per
> https://github.com/mozilla/balrog/blob/master/uwsgi/admin.wsgi#L45. I'll get
> a bug on file to enable this once I figured out where to send it.

https://bugzilla.mozilla.org/show_bug.cgi?id=1304082 - balrog-db-changes@mozilla.com.

> Since this covers the rules table, this will mail whenever the change is
> applied. Could be that we should enable this for the scheduled changes table
> as well, though.
Depends on: 1304082
(Assignee)

Comment 44

5 months ago
Scheduled Changes have been working in production for a while now. We don't have support for uptake based scheduling yet, but that's because Telemetry doesn't yet have low-latency ADI information (bug 1240522). We should follow up on that when it does.
Status: NEW → RESOLVED
Last Resolved: 5 months ago
Resolution: --- → FIXED