Closed Bug 1502410 Opened 6 years ago Closed 6 years ago

On preference experiment expiration, app.normandy.startupExperimentPrefs.* is not removed

Categories

(Firefox :: Normandy Client, defect, P1)

defect

Tracking

()

VERIFIED FIXED
Firefox 65
Tracking Status
firefox-esr60 --- unaffected
firefox63 blocking verified
firefox64 + verified
firefox65 + verified

People

(Reporter: aflorinescu, Assigned: mythmon)

References

Details

(Keywords: regression)

Attachments

(1 file, 1 obsolete file)

[Description:]
Once a client gets enrolled into a preference experiment, Normandy creates a preference: app.normandy.startupExperimentPrefs.experiment_preference_name and stores the experiment temporary value in it in order for it to be available in early start-up. 

This is mainly due to the fact that the preference experiment are still remote actions and not local actions.

In a normal behavior, the app.normandy.startupExperimentPrefs.experiment_preference_name should be removed once the experiment expires.

[Affected versions]:
Release 63.0    
Beta 64.0b3 2018-10-22 
Nightly 65 2018-10-26




[Affected platforms]:
- most likely all, tested on Windows 10/ Windows 8.1

[Pre-conditions:]
user_pref("security.content.signature.root_hash", "DB:74:CE:58:E4:F9:D0:9E:E0:42:36:BE:6C:C5:C4:F6:6A:E7:74:7D:C0:21:42:7A:03:BC:2F:57:0C:8B:9B:90");
user_pref("app.normandy.api_url", "https://normandy.stage.mozaws.net/api/v1");
user_pref("app.normandy.dev_mode", true);
user_pref("app.normandy.logging.level", 0);


[Steps to reproduce]:
1. Create/Publish a pref-experiment that sets an existent int preference.
2. Run a firefox client with the pre-conditions.
3. From delivery console disable the pref-experiment
4. Restart the client. 
5. Restart the client.

[Actual Result:]

2. Pref-experiment is applied to the client.
4. Upon restart, pref-experiment expires and the value for the preference is set to the initial default value.
app.normandy.startupExperimentPrefs.experiment_preference_name is not removed.
5. Upon the 2nd restart, the value for the preference that has been involved into the preference experiment is now re-set to the experiment value, due to the fact that app.normandy.startupExperimentPrefs.experiment_preference_name was not removed.

[Expected Result:]
The expected result is that upon experiment expiration, the experimental value is reset to initial value and the app.normandy.startupExperimentPrefs.experiment_preference_name is removed.

[Regression Range:]
Since for Normandy regression ranges, mozregression is not usable, a manual regression range was attempted and we managed to reduce the regression range to this:

last good build: 63.0a1 (2018-07-14)
first bad build: 63.0a1 (2018-07-15)

https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=105+days+ago&enddate=103+days+ago

The most likely candidate from the above pushlog to introduce this regression in my oppinion might be bug 1471025.

[User Impact:]
This bug causes users that have been involved into a preference experiment to have the experimental value set again after the first restart after the experiment has expired.
Keywords: regression
> This is mainly due to the fact that the preference experiment are still remote actions and not local actions.

This isn't true. The system of storing values in temporary preferences available during early startup is not going to change when we convert preference experiments to a local action. This system is required to make durable changes to the default preference branch, regardless of what mechanism we use. This would require major changes to the preference system to change.
I'm now fairly convinced that this is a dupe of bug 1502078. Until we have a fix though, I'll leave them both open.

In exploring this I learned something interesting. In the original STR, the "restart" step is a normal step in these bugs to cause Normandy to easily run again. However, in my testing it is also required to manifest the bug. When I enroll and then unenroll without a restart in between, the problem is not reproduced. I'm not sure what this means, but it is an interesting clue.
With bug 1471025, preferences are no longer simply stored in a hash
table; they're now also stored in a section of immutable shared memory
that is available to child processes. This means that deleting
preferences is more complicated than it was before. ClearUserPref
seems to have been updated, but DeleteBranch seems not to be. Update
DeleteBranch to just explicitly use ClearUserPref here.
Per some discussion on IRC, both :mythmon and I have seen that any pref applied by Normandy during startup is subsequently impossible to delete using `deleteBranch`. I think the explanation for what's going on is in https://bugzilla.mozilla.org/show_bug.cgi?id=1471025#c95 :

> As soon as we're ready to launch a content process, the full state of the hash
> table is serialized to a snapshot, which is stored in a shared memory region
> managed by a SharedPrefMap instance. The original hash table is then cleared,
> and used only to store changed preference values.
> 
> [...]
> 
> Preference deletions:
> ---------------------
> 
> Deleting preferences is slightly more complicated. If a preference to be
> deleted exists only in the dynamic hash table, its entry can simply be
> removed. If it exists in the shared snapshot, however, its dynamic hash table
> entry needs to be kept (or created), and its type changed to None. The
> presence of this entry masks the snapshot entry, and is ignored by preference
> enumerators.

It looks like the code to `ClearUserPref` got updated in this way, but `DeleteBranch` never was.

I tried to write a patch for this and I got something that seems to work but I haven't even run the test suite yet. I'm hoping for early feedback from :kmag, including a good way to write a test for this. I guess this probably isn't correct for something like `Services.prefs.getDefaultBranch("").deleteBranch("app.normandy.startupExperimentPrefs.")` ?
Given that Ethan has traced the problem to `deleteBranch`, I prepared a proof-of-concept patch that avoids using it in most places in Normandy. It isn't possible to avoid in all places, but this should at least be enough to confirm the problem. Ultimately the better solution is more along the lines of Ethan's fix to the root cause.

I've pushed this patch to Try here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=942f1fb2f0c1e8e776e79d240d273c812da3a2fa

Adrian, can you try and replicate this bug and bug 1502078 with the builds from that try run? It seems to work for me locally.
Flags: needinfo?(adrian.florinescu)
[Tracking Requested - why for this release]:

Bug 1502078 was marked as tracking=blocking for Firefox 63. I believe these two bugs are duplicates of eachother, and we should transfer the tracking flags to this bug.
Using the try build from comment 5, this issue seems to be fixed:
 - app.normandy.startupExperimentPrefs.experiment_preference_name is removed after first restart
 - pref-experiment value is correctly set to the original value as early as the first restart
Flags: needinfo?(adrian.florinescu)
Set tracking to blocking for 63.0.2
Assignee: nobody → eglassercamp
Flagging :kmag for more opinions about how I should go about writing this.
Flags: needinfo?(kmaglione+bmo)
Priority: -- → P1
The simplest thing to do would be just using getChildList and calling clearUserPref on the results.
Flags: needinfo?(kmaglione+bmo)
(In reply to Kris Maglione [:kmag] from comment #11)
> The simplest thing to do would be just using getChildList and calling
> clearUserPref on the results.

This works if we are trying to clear the user pref. It doesn't work when the goal is to delete a preference on the default branch. That's which is something we are doing, as I explained in Phabricator [0]. The en-masse deletion isn't really that important to me, since getChildList is easy enough to use. The ability to undo the addition of a preference on the default branch is the important part.

[0]: https://phabricator.services.mozilla.com/D10291#247021
Flags: needinfo?(kmaglione+bmo)
Preferences in the default branch are removed on browser restart. Is there a particular reason you need to remove them more quickly than that?
Flags: needinfo?(kmaglione+bmo)
For some Normandy actions, we create new preferences on the default branch at run time. We remove them in response to unenrolling a user from an experiment or rolling back a rollout. Sometimes this unenrollment happens because of user action, in which case it should occur immediately. Other times we do this because the experiment has reached its natural end or the user is no longer applicable to a given client. If Normandy now must wait until the next browser restart, it would represent a major regression in responsiveness.

Besides Normandy needs, the documentation says that this should work. Either the documentation should be updated, or this function should be fixed to match the documentation.
(In reply to Michael Cooper [:mythmon] from comment #14)
> For some Normandy actions, we create new preferences on the default branch
> at run time.

Why?
It's a behavior allowed by the preference system, and it is a useful capability to have while designing experiments. If you would like to officially deprecate this feature, instead of leaving it in a broken state, we can have that discussion. As it stands now a documented and previously working function does not do what the documentation says. That's a bug, regardless of how or why it is used.
Yes, I am considering writing a patch to remove that function.

So, again, why are you adding preferences to the default branch at runtime? That is a very unusual thing to do, especially given that default preferences do not persist across a restart, so there is probably a better way for you to do this which can be deployed without disruption.
A recent example of this is bug 1492568, where WebRender ran an experiment turning WebRender off for a portion of Nightly, to gather more information about how it compared to non-WebRender compositors. This study used the pref gfx.webrender.qualified.enabled, which doesn't have a default value in the tree. There are other cases where there are preferences that Firefox reacts to that don't have default values. It's generally allowed and code uses it, and so Normandy supports it to be more flexible.

If you're going to remove the ability to set pref values that don't already have default values, are you going to also make them impossible to read with a fallback, like |getCharPref("foo.bar", "fallback")|? If this change was made first, then Normandy would have no reason to support features that work with prefs like this. Without this change though, there are still features that would require setting preferences like this.

The fact that there are only two places to make changes to the preferences system (one of which is non-persistent, and the other which can't be distinguished from user action) has been a major problem for Normandy. We've gone to great lengths to be able to use the default branch in order to preserve user choice. If you're thinking about large scale changes to the preferences system, it would be a great boon to the experiments program to be able to make persistent changes to the preferences that don't interfere with the user branch.
I don't intend to remove support for creating new default preferences, only for resetting preference branches. I might consider leaving support for clearing a specific default pref, but I'd rather not.

In general, any preference that's accessed in runtime should have a default value, even if the getters use a fallback value. If normandy needs to temporarily change the default value of those that don't in the mean time, it should probably reset them by setting the default value to something reasonable rather than trying to delete the branch.
And, given that the preferred behavior is for all used preferences to have a default value in-tree, if you want to support changing the default value rather than the user value, you need to support restoring the default to some pre-set value, anyway.
> any preference that's accessed in runtime should have a default value

I agree that is a good goal. However, until we enforce it in code, Normandy has to deal with the alternative that sometimes preferences will be accessed that don't exist.

> If normandy needs to temporarily change the default value of those that don't in the mean time, it should probably reset them by setting the default value to something reasonable rather than trying to delete the branch.

The reasonable value to reset the preference to is the value it held before: nothing. Normandy can't soundly assume any better default, as the fallback values across the code base could vary at different call sites, and so any value would be a change of behavior from no value.

> And, given that the preferred behavior is for all used preferences to have a default value in-tree, if you want to support changing the default value rather than the user value, you need to support restoring the default to some pre-set value, anyway.

In the case that the preference had a value before Normandy acted, Normandy resets the pref back to what it was. It only uses deleteBranch in the case that the preference didn't already exist.

I don't actually care about deleteBranch in particular. I don't need an en-masse modification tool. I want Normandy to be able to undo it's own actions when it creates a new default branch preference at run time.
(In reply to Michael Cooper [:mythmon] from comment #21)
> > any preference that's accessed in runtime should have a default value
> 
> I agree that is a good goal. However, until we enforce it in code, Normandy
> has to deal with the alternative that sometimes preferences will be accessed
> that don't exist.

I think it's reasonable for Normandy to require any preference it flips to have an in-tree default value. That's already the norm, and adding friction to the unusual case of failing to add a default is a good thing, as far as I'm concerned.

In the mean time, any preferences Normandy can't reset the defaults for will automatically be reset when the browser restarts.
From the QA perspective, a Normandy fix would mean high risk, since while it needs to fix the issue presently, it also needs to fix the reset values for all the pref-flips and roll-outs launched or finished in the regressing period and it's a proven fact that what QA can test cover a very small fraction of what the wild has in environment variety, so there should be a high concern of any major Normandy changes, especially across all channels. 
Adding to the above, regardless of what we fix, we would still need to uplfit it to beta and release.

Kris, what would be the risks reverting the preference changes that affect Normandy, also taken in account the fact that we need to uplift them across the beta/release channels?
Flags: needinfo?(kmaglione+bmo)
(In reply to Adrian Florinescu [:adrian_sv] from comment #23)
> Kris, what would be the risks reverting the preference changes that affect
> Normandy, also taken in account the fact that we need to uplift them across
> the beta/release channels?

That's a non-starter. This was a major rewrite of most large parts of the preferences service.
Flags: needinfo?(kmaglione+bmo)
See Also: → 1505941
nsIPrefBranch.deleteBranch doesn't work as documented when the preference's
default value was set very early after Firefox has started, such as when
Normandy sets startup branches. This is filed as bug 1505941. In order to work
around this problem, this patch makes Normandy never use deleteBranch, except
in tests where it is safe to do so.

With this patch, an experiment that is run on the default branch for a
preference that does not have a default value in the tree cannot be promptly
unenrolled, instead we must wait until the preference is naturally cleared when
Firefox restarts. This is better than never unenrolling though.
Assignee: eglassercamp → mcooper
Status: NEW → ASSIGNED
Pushed by mcooper@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/bd4297faba36
Don't use deleteBranch in Normandy r=Gijs
See Also: → 1506221
Comment on attachment 9023779 [details]
Bug 1502410 - Don't use deleteBranch in Normandy r=Gijs

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1471025

User impact if declined: Users are effectively never unenrolled from Normandy preference experiments

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: Yes

If yes, steps to reproduce: STR are in comment 0 of this bug

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): The change here is simple and well understood by several people.

String changes made/needed: None
Attachment #9023779 - Flags: approval-mozilla-release?
Attachment #9023779 - Flags: approval-mozilla-beta?
Michael, I had planned to go to build our next dot release, will you have an updated patch by then or should I evaluate delaying our dot release to the end of the week for your patch? Thanks
> will you have an updated patch by then

Sorry by when? I think you may have left something out in that sentence.

That being said, I hope to have this landed by the end of the day. The problems preventing it are some small test problems.
Flags: needinfo?(mcooper)
(In reply to Michael Cooper [:mythmon] from comment #30)
> > will you have an updated patch by then
> 
> Sorry by when? I think you may have left something out in that sentence.
> 
> That being said, I hope to have this landed by the end of the day. The
> problems preventing it are some small test problems.

The plan is to go to build 63.0.3 today.
(In reply to Michael Cooper [:mythmon] from comment #30)
> That being said, I hope to have this landed by the end of the day. The
> problems preventing it are some small test problems.

Michael, we are currently holding up 63.0.3 on this patch, can you give us a status update? Thanks
Flags: needinfo?(mcooper)
:mythmon is sick today, but this patch was held up in review, which was provided a couple hours ago. I can't land it myself but we believe it is suitable for shipping.
Flags: needinfo?(mcooper)
Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/986918003af9
Don't use deleteBranch in Normandy r=Gijs,adw
Keywords: checkin-needed
Comment on attachment 9023779 [details]
Bug 1502410 - Don't use deleteBranch in Normandy r=Gijs

We need to fix this blocking issue on release63 and beta64
Attachment #9023779 - Flags: approval-mozilla-release?
Attachment #9023779 - Flags: approval-mozilla-release+
Attachment #9023779 - Flags: approval-mozilla-beta?
Attachment #9023779 - Flags: approval-mozilla-beta+
https://hg.mozilla.org/mozilla-central/rev/986918003af9
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 65
Flags: qe-verify+
To sumarize the issue on the affected builds:
- any normady preference type recipes (pref-experiments / rollouts / rollbacks) are stuck with normandy values which are not reset to the original value when the recipe expires/graduates/rollsback.

The fix has two verification sides:
1. fix the problem so that normandy sets values for preferences for historical values (recipes that were ended during the affected builds 63/64/65) 
2. fix Normandy functionality in regards to recipe expiration

Basically, the difference in functionality when compared with the version before the regression is that now, with the fix, on preference expiration it is required one restart in order to have the preference values reset.


We have checked the following on Windows 10 / Windows 8.1:

Preference Experiments:
For pref-experiments created for both existent and non-existent preferences scenarios;
For pref-experiments created for both existent and non-existent preferences scenarios on a fixed build;


Nightly:
Affected: 65.0a1 (2018-10-25) (64-bit) 20181025100246  Exp. started and ended.
Verified Fixed: 65.0a1 (2018-11-15) (64-bit)  20181115100051 After first restart with the updated version, the values are correctly reset. 

Beta:
Affected: 64.0b3, 20181022150107  Exp. started and ended
Verified Fixed: 64.0b10; 20181114213203  After first restart with the updated version, the values are correctly reset. 

RC 0.3:
Affected: 63.0.1 20181030165643        Exp. started and ended
Verified Fixed: 63.0.3 	20181114214635       After first restart with the updated version, the values are correctly reset. 


Rollouts/Rollbacks:
Affected: 63.0.1 20181030165643
Verified Fixed: 63.0.3 	20181114214635 

Confirmed for Rollbacks that the preference values are not reset to original values after the rollout is rolledback with a release affected build and verified that the values are reset to the original values after the first restart. The better test here would've been to check the graduation, but that is not possible without a special default value build for the test preferences and QA doesn't feel that not checking the graduation scenario poses high risk.

[Note:] Basic fix check was done on ubuntu and osx platforms since the QA doesn't assess this fix as being a platform sensitive change.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Attachment #9021320 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: