Open Bug 1323779 Opened 8 years ago Updated 2 years ago

flushPrefEnv (called at end of mochitest) hangs indefinitely if a pref env restore would restore a pref to a value it already has

Categories

(Testing :: Mochitest, defect)

defect

Tracking

(Not tracked)

People

(Reporter: Gijs, Unassigned)

Details

Eh, I was confused which pref this was about. Updated STR:

1. Run a test that looks like this:

yield SpecialPowers.pushPrefEnv({set: [["foo.bar", "somestr"]]});
Services.prefs.clearUserPref("foo.bar");
yield SpecialPowers.pushPrefEnv({set: [["foo.bar", "somestr"]]});
Services.prefs.clearUserPref("foo.bar");

where "foo.bar" doesn't normally exist


ER:
the test completes and the test framework shuts down afterwards

AR:
the test hangs indefinitely after finishing. Debugging shows we get stuck in browser-test's nextTest function, specifically trying to do this:

yield new Promise(resolve => SpecialPowers.flushPrefEnv(resolve));
Iirc, the behavior of pushPrefEnv is not defined when you mix it with the old style pref setting.
Flags: needinfo?(jmaher)
(In reply to Martijn Wargers [:mwargers] from comment #2)
> Iirc, the behavior of pushPrefEnv is not defined when you mix it with the
> old style pref setting.

The code *under test* is the thing that's using "old style" pref setting, so it can't use pushPrefEnv. The example I gave is just simplified so it's easy to test locally.
pushPrefEnv expects all calls to go through it, otherwise we lose out on things.  While it does listen for changes, it doesn't know the previous state of the pref so it doesn't have a stack to pop off the values during the cleanup/reset stage.

Is this something we want to support?  Can we not modify the test cases to use pushPrefEnv?
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher) from comment #4)
> pushPrefEnv expects all calls to go through it, otherwise we lose out on
> things.  While it does listen for changes, it doesn't know the previous
> state of the pref so it doesn't have a stack to pop off the values during
> the cleanup/reset stage.
> 
> Is this something we want to support?  Can we not modify the test cases to
> use pushPrefEnv?

As I already said in comment #3, it's not the testcase's fault. The code under test (so the actual jsm we're testing) needs pref X to be set, and then does stuff, and then *that (production) code* resets pref X. So we set it again the next time we want to test that code, and so forth. The production code could never use pushPrefEnv - it isn't test code.

I don't understand why this is hard to cope with, to the point that the pushPrefEnv implementation just hangs the test runner. Where does the implementation live?
Flags: needinfo?(jmaher)
It's hard to cope with because prefs are not synchronous with e10s enabled. The code is here:
https://dxr.mozilla.org/mozilla-central/rev/6dbc6e9f62a705d5f523cc750811bd01c8275ec6/testing/specialpowers/content/specialpowersAPI.js#1022

It waits for a notification from the prefservice that the pref value changed, which is assuredly what the problem is here. That code exists because otherwise in e10s tests the test body will race with setting the pref. :-/
Yes, there are 2 reasons to use pushPrefEnv, afaik:
1 - provide an asynchronous method to set a pref, so to make sure the pref is set in e10s.
2 - that the pref is reset to the previous state after the test has been run

When you're mixing Services.prefs.clearUserPref with pushPrefEnv, you're defeating both reasons to use pushPrefEnv.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #6)
> It's hard to cope with because prefs are not synchronous with e10s enabled.
> The code is here:
> https://dxr.mozilla.org/mozilla-central/rev/
> 6dbc6e9f62a705d5f523cc750811bd01c8275ec6/testing/specialpowers/content/
> specialpowersAPI.js#1022
> 
> It waits for a notification from the prefservice that the pref value
> changed, which is assuredly what the problem is here. That code exists
> because otherwise in e10s tests the test body will race with setting the
> pref. :-/

I'm in a browser test. All the relevant code runs in the parent process. The yields for the pushPrefEnv return fine. It's only when mochitest cleans up that things go pearshapes. I don't think there's any possibility for test body races here, just broken cleanup code somewhere.

(In reply to Martijn Wargers [:mwargers] from comment #7)
> Yes, there are 2 reasons to use pushPrefEnv, afaik:
> 1 - provide an asynchronous method to set a pref, so to make sure the pref
> is set in e10s.
> 2 - that the pref is reset to the previous state after the test has been run
> 
> When you're mixing Services.prefs.clearUserPref with pushPrefEnv, you're
> defeating both reasons to use pushPrefEnv.

Not really. If the code is broken, I want the pref to reset at the end of the test (to avoid breaking future tests), which is item (2) in your list. It's only because the code is currently not broken that the code under test is changing the pref.
Specifically, if my reading of https://dxr.mozilla.org/mozilla-central/rev/6dbc6e9f62a705d5f523cc750811bd01c8275ec6/testing/specialpowers/content/specialpowersAPI.js#1153-1172 is correct, the callback never gets called if the pref is already the desired value. We avoid this case for pushPrefEnv ( https://dxr.mozilla.org/mozilla-central/rev/6dbc6e9f62a705d5f523cc750811bd01c8275ec6/testing/specialpowers/content/specialpowersAPI.js#1073-1074 ) but that doesn't help for the restoring of state at the end. There should be a similar check for the state restore.
Summary: pushPrefEnv gets confused if someone else changes your pref and hangs the test framework on finish → flushPrefEnv (called at end of mochitest) hangs indefinitely if a pref env restore would restore a pref to a value it already has
(In reply to :Gijs Kruitbosch from comment #8)
> Not really. If the code is broken, I want the pref to reset at the end of
> the test (to avoid breaking future tests), which is item (2) in your list.

Which "the code"? The test code? That is specifically what item (2) is about, to reset after the test has run, to avoid breaking future tests.
(In reply to Martijn Wargers [:mwargers] from comment #10)
> (In reply to :Gijs Kruitbosch from comment #8)
> > Not really. If the code is broken, I want the pref to reset at the end of
> > the test (to avoid breaking future tests), which is item (2) in your list.
> 
> Which "the code"? The test code? That is specifically what item (2) is
> about, to reset after the test has run, to avoid breaking future tests.

No, the code under test - Firefox code. Right now, that code uses clearUserPref. In the future, maybe it won't. The test should continue to work regardless.
I am glad to see this bug and learn more.  I think the title change is something more actionable we should resolve sooner than later, and the concept of tracking pref changes the browser does when the test does actions might be interesting to track or notify via the log file, but maybe something longer term.

From what I understand we are calling clearUserPref() in the flushPrefEnv() cleanup and this is hanging.  I am not sure why this would happen unless there is no userpref set- maybe prior to calling clearUserPref we check if the pref exists?

Another bug is that we do listen for pref change events and if we are getting an event from the browser itself doing this action, we might miss our event from the pushPrefEnv call.  Possibly we can catch changes here:
https://dxr.mozilla.org/mozilla-central/rev/6dbc6e9f62a705d5f523cc750811bd01c8275ec6/testing/specialpowers/content/specialpowersAPI.js#1154

I am thinking we could validate the proper type/value is set for the given pref and if it is different we would know that something else set a pref (or a nasty bug was hit).  Again, logging this information would go a long way towards determining when this happens- I would argue we should fail the test so we can investigate this in more details as the browser state is changing.
Flags: needinfo?(jmaher)
AIUI the pref modification that flushPrefEnv causes (in this case, a clearUserPref call) works fine, but is a no-op. But because flushPrefEnv *also* waits for the pref modification to take effect, and doesn't hear anything (because it was a no-op), it waits indefinitely and thus 'hangs' the test framework (the promise flushPrefEnv returns never resolves, and if you pass it a callback that never fires).

Note that this is not entirely trivial to fix because right now the code assumes that given, say, 5 prefs it needs to revert to their original value, it will just do that for the first 4 without waiting for any changes, and then waits for changes for the 5th one, assuming that by then, the first 4 changes will for sure have happened (which would be reasonable if it wasn't for this bug). So we can't *only* make it not wait for changes if a pref change is a no-op, because then we wouldn't wait at all, which will lead to a race with the other prefs being reset vs. when we start the next test (although that's a race that we're unlikely to lose given there's probably other IPC traffic that will need to happen before the next test runs).
Though, also note that even if we don't have a callback, applyPrefs registers a pref service observer, which might not fire and thus not be removed, and thereby be leaking the world until the pref does end up changing. :-\
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.