Closed Bug 1023483 Opened 10 years ago Closed 6 years ago

Prefs used in test automation are listed in too many places & are out of sync

Categories

(Testing :: General, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: emorley, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: sheriffing-P1)

Whilst bug 830430 improved the situation, we still have a large number of locations for specifying prefs (eg for "disable updates", "use fake URL for foo, so we don't hit the external network") used during test automation.

This means that every time someone adds a new feature that could interfere with the test run, they have to pref it off in multiple locations, meaning that inevitably some get missed.

These locations include (I've tried to find all of them, but this list may not be complete):

http://mxr.mozilla.org/mozilla-central/source/testing/profiles/prefs_general.js
http://mxr.mozilla.org/mozilla-central/source/testing/profiles/prefs_b2g_unittest.js

http://mxr.mozilla.org/mozilla-central/source/js/src/tests/user.js

http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-preferences.js
http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/runreftestb2g.py#416
http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/b2g_desktop.py#101
http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/remotereftest.py#340

http://mxr.mozilla.org/mozilla-central/source/testing/mozbase/mozprofile/mozprofile/profile.py#312

https://hg.mozilla.org/build/talos/file/tip/talos/PerfConfigurator.py#l243

As a result, we periodically find that the root cause of various intermittent failures is one of these locations being missed, hence the following bugs over the last 12-18 months:

Bug 808824 - Set extensions.blocklist.enabled to false for talos tests
Bug 874049 - add preferences to talos to ignore requests to the wild
Bug 758068 - The Talos suite appears to be downloading and staging updates on test slaves
Bug 999518 - don't hit the production FxA server during mochitests
Bug 995995 - set testing prefs to redirect to the test proxy server for RSS feeds
Bug 997188 - Handle the experiments.manifest.uri pref correctly so that it doesn't ping out to telemetry-experiments.cdn.mozilla.net by default or enable experiment code in the tests
Bug 992611 - Disable speculative connections in mochitests so they aren't reported as leaking when they're hanging around
Bug 965358 - Disable android snippets in default testing profile
Bug 997820 - tests randomly connect to incoming.telemetry.mozilla.org
Bug 992324 - disable interruptible reflow in reftest
Bug 840186 - Automation infrastructure should not submit FHR data to production
Bug 994302 - disable speculative connections for talos
Bug 996871 - add preferences to talos to remove external network access for rss
Bug 1022785 - disable android snippets for reftests/crashtests/jsreftests
Bug 1018400 - disable safebrowsing during reftests
Bug 1023450 - Experiments are running during test automation

We should:
1) Short term: Sync up the prefs asap (since I believe things like bug 1023360 are being caused by them).
2) Short term: Add comments to each file pointing to somewhere listing the full set of locations that any changes need to be mirrored to.
3) Longer term: pull more prefs into the central location (testing/profiles/prefs_general.js) and get the remaining harnesses to use it, rather than their own prefs lists.

This will also help with bug 995417/bug 617414.
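As a rough illustration of step 1, a small script could flag prefs that are set in the central file but missing (or set differently) elsewhere. This is only a sketch: it assumes each list has already been extracted as text in the `user_pref("name", value);` form used by the .js pref files, and harnesses that keep their prefs in Python (e.g. PerfConfigurator.py) would need their own extraction step. The function names and the pref names in the usage note are illustrative, not taken from any existing tool.

```python
import re

# Matches lines such as: user_pref("some.pref.name", false);
# Commented-out lines are skipped because the call must start the line.
PREF_RE = re.compile(
    r'^\s*(?:user_)?pref\(\s*"([^"]+)"\s*,\s*(.+?)\s*\)\s*;',
    re.MULTILINE,
)


def parse_prefs(text):
    """Return {pref_name: raw_value_text} for every pref call in text."""
    return {name: value for name, value in PREF_RE.findall(text)}


def missing_prefs(reference, others):
    """Map each label in `others` to the reference prefs it lacks or sets differently."""
    report = {}
    for label, prefs in others.items():
        diffs = sorted(
            name for name, value in reference.items()
            if prefs.get(name) != value
        )
        if diffs:
            report[label] = diffs
    return report
```

For example, if the central list disables both `app.update.enabled` and `browser.safebrowsing.enabled` but the talos list only has the first, `missing_prefs(general, {"talos": talos})` would report the second under `"talos"`.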
Blocks: 995417
Depends on: 994302, 996871
I'd like to propose an alternate solution.

Test-specific prefs are just one of a larger class of actions performed to mutate state during test execution. Another thing that falls in that bucket is waiting on services such as FHR to initialize before running tests.

I'd like to propose some magic (perhaps via XPCOM - manifests or not) where components/features can register "hey, I'm interested in doing something special in test mode." When the test harness spins up, it iterates through every party that has expressed an interest in test-specific mode and says "do your thing. I'll wait for you to tell me you are ready." And, when tests have finished executing, we can iterate through again and say "OK, we're done with the tests, restore state."

We could even extend this to per-test callbacks so individual components have a way of resetting important state or verifying non-mutated state after a test executes. For example, we have some tests that disable components like Telemetry Experiments because we don't want that feature fetching from remote URIs during the mochitest browser run. However, there are mochitests for Telemetry Experiments themselves that monkeypatch the remote fetching so we can, you know, test Telemetry Experiments. The existing solution involves lots of temporary state mutation during individual tests. We frequently get things wrong or fail to restore state after e.g. an unexpected failure. Having a hook point where components can restore expected state after an individual test should help enforce environment consistency and hopefully reduce the number of intermittent failures.

If we were to move forward with this approach, I can think of the following implementations:

1) Use the category manager to register consumers (function names?) that are interested in certain events.
2) Establish some kind of registrar service that components call into when they are initialized. This registrar service holds a manifest of all interested parties for various events.
3) Use observers somehow. But observers are synchronous. We must support async events. See bug 722648 for an async observer proposal.
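To make option 2 a little more concrete, here is a minimal sketch of the registrar idea with async hooks. It is written in Python purely for illustration (a real implementation would presumably live in XPCOM/JS), and `TestModeRegistrar`, its method names and the "fhr" hooks are all hypothetical.

```python
import asyncio


class TestModeRegistrar:
    """Hypothetical registrar: components register async setup/teardown hooks."""

    def __init__(self):
        self._hooks = []  # list of (name, setup, teardown) tuples

    def register(self, name, setup, teardown):
        self._hooks.append((name, setup, teardown))

    async def enter_test_mode(self):
        # "Do your thing. I'll wait for you to tell me you are ready."
        await asyncio.gather(*(setup() for _, setup, _ in self._hooks))

    async def leave_test_mode(self):
        # "We're done with the tests, restore state", in reverse order.
        for _, _, teardown in reversed(self._hooks):
            await teardown()


async def demo():
    log = []
    registrar = TestModeRegistrar()

    async def fhr_setup():
        await asyncio.sleep(0)  # stand-in for waiting on service initialization
        log.append("fhr ready")

    async def fhr_teardown():
        log.append("fhr restored")

    registrar.register("fhr", fhr_setup, fhr_teardown)
    await registrar.enter_test_mode()
    log.append("tests run")
    await registrar.leave_test_mode()
    return log
```

Running `asyncio.run(demo())` walks through the three phases in order. Per-test callbacks could be added the same way, as a second pair of hooks invoked around each test.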
(In reply to Ed Morley [:edmorley UTC+0] from comment #0)
> These locations include (I've tried to find all of them, but this list may
> not be complete):

Always one more, right?

http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/runreftest.py#168

For whatever reason, reftest lets you set preferences in two places: one in the test scripts, and one in a separate file that gets packaged up in the reftest component.  It's worth noting that the latter is suboptimal for setting "disable this service" preferences, because some services spin themselves up based on a given pref (and pull in URLs from prefs, etc.), and don't respond properly to pref changes after startup.

In an ideal world, of course, we'll set all the relevant preferences in the profile, and this won't be an issue.  But it's something to keep in mind for would-be bug-fixers.
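For would-be bug-fixers, "setting it in the profile" boils down to the pref being in the profile's user.js before the browser process starts. A minimal sketch of that, not using mozprofile's actual API; `write_user_js` and the `example.*` pref names are made up for illustration:

```python
import json
import tempfile
from pathlib import Path


def write_user_js(profile_dir, prefs):
    """Write prefs as user_pref() lines to <profile_dir>/user.js and return the path."""
    lines = [
        "user_pref({}, {});".format(json.dumps(name), json.dumps(value))
        for name, value in sorted(prefs.items())
    ]
    path = Path(profile_dir) / "user.js"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def demo():
    # Hypothetical prefs: disable a service and point its URL at a local dummy.
    prefs = {
        "example.service.enabled": False,
        "example.service.url": "http://localhost/dummy",
    }
    with tempfile.TemporaryDirectory() as profile_dir:
        return write_user_js(profile_dir, prefs).read_text(encoding="utf-8")
```

Because the file exists before launch, a service that only reads its prefs at startup sees the test values from the beginning.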
(In reply to Ed Morley [:edmorley UTC+0] from comment #3)
> Similarly, our env variables have to be set in multiple places too:

Also:
http://mxr.mozilla.org/build-central/source/talos/talos/PerfConfigurator.py#216
http://mxr.mozilla.org/build-central/source/talos/talos/ttest.py#273

+ a bunch of places in buildbot-configs/buildbotcustom/mozharness (http://mxr.mozilla.org/build-central/search?string=MOZ_CRASHREPORTER), but those will affect all builds, whereas (at least for now) we need things like MOZ_DISABLE_NONLOCAL_CONNECTIONS to ride the trains.

In the future perhaps we could define MOZ_DISABLE_NONLOCAL_CONNECTIONS, MOZ_CRASHREPORTER etc based on MOZ_AUTOMATION, to have just one define to rule them all in automation environments?
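Something along these lines, as a sketch only: the variable names are the ones mentioned above, but the values, the `MOZ_AUTOMATION == "1"` check and the `automation_env` helper are assumptions for illustration.

```python
# Assumed defaults to apply whenever MOZ_AUTOMATION is set; values are illustrative.
AUTOMATION_DEFAULTS = {
    "MOZ_DISABLE_NONLOCAL_CONNECTIONS": "1",
    "MOZ_CRASHREPORTER": "1",
}


def automation_env(base_env):
    """Return a copy of base_env with automation defaults filled in.

    Defaults are only applied when MOZ_AUTOMATION is "1", and never
    override a value the caller has set explicitly.
    """
    env = dict(base_env)
    if env.get("MOZ_AUTOMATION") == "1":
        for name, value in AUTOMATION_DEFAULTS.items():
            env.setdefault(name, value)
    return env
```

A harness would then call `automation_env(os.environ)` once when building the environment for the browser process, instead of each harness setting the individual variables itself.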
No longer depends on: 994302
Depends on: 1030093
Depends on: 1030111
(removing deps added to help the 'sync up' part of this bug, since I think they add more noise than they help)
No longer depends on: 1030093, 1030111, 996871
Also in testing/mozbase/mozrunner/mozrunner/base/device.py
Just been asked on IRC - do we have a wiki page where these are listed?
(In reply to Ed Morley [:edmorley] from comment #8)
> Just been asked on IRC - do we have a wiki page where these are listed?

I started listing them here:
https://developer.mozilla.org/en-US/docs/Mozilla/QA/Automated_testing#Need_to_set_preferences_for_test-suites.3F
For talos, while global talos prefs are set via PerfConfigurator.py, test-specific prefs are set per-test (if required) via test.py, e.g. for the session restore test: https://hg.mozilla.org/build/talos/file/49b74c08dad4/talos/test.py#l209
Mass-closing old bugs I filed that have not had recent activity/no longer affect me.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INCOMPLETE