Chaos mode makes test verification unreliable
Categories
(Testing :: General, enhancement, P3)
Tracking
(Not tracked)
People
(Reporter: gbrown, Unassigned)
References
(Blocks 1 open bug)
Attachments
(3 files)
2.99 KB, patch | jmaher: review+
6.01 KB, patch | jmaher: review+
47 bytes, text/x-phabricator-request
Test verification tries to efficiently find intermittent test failures by running just-modified tests repeatedly and in various configurations or environments. The initial implementation includes running tests in chaos mode (MOZ_CHAOS_MODE environment variable set). Initial tests indicate that many more failures occur in chaos mode than in regular mode. I want to investigate those failures and determine if chaos mode is appropriate and practical for test verification.
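The approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual test-verify code: the step list, the repeat counts, the `verify` and `step_env` helpers, and the placeholder value "1" for MOZ_CHAOS_MODE are all assumptions made for the example. The only detail taken from this bug is that chaos mode is requested by setting the MOZ_CHAOS_MODE environment variable.

```python
import os
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical step list: (label, MOZ_CHAOS_MODE value or None, repeat count).
# The value "1" is a placeholder; this sketch only assumes the variable is set.
STEPS: List[Tuple[str, Optional[str], int]] = [
    ("regular mode", None, 10),
    ("chaos mode", "1", 10),
]


def step_env(base: Dict[str, str], chaos_value: Optional[str]) -> Dict[str, str]:
    """Return a copy of base with MOZ_CHAOS_MODE set or removed for one step."""
    env = dict(base)
    env.pop("MOZ_CHAOS_MODE", None)
    if chaos_value is not None:
        env["MOZ_CHAOS_MODE"] = chaos_value
    return env


def verify(
    run_test: Callable[[Dict[str, str]], bool],
    base_env: Optional[Dict[str, str]] = None,
    steps: List[Tuple[str, Optional[str], int]] = STEPS,
) -> List[Tuple[str, int, int]]:
    """Run each step repeatedly; stop at the first step that has a failure.

    run_test is a caller-supplied callable (hypothetical) that runs the
    modified test once with the given environment and returns True on pass.
    Returns (label, passes, requested repeats) for each step attempted.
    """
    base = dict(os.environ if base_env is None else base_env)
    results = []
    for label, chaos_value, repeats in steps:
        env = step_env(base, chaos_value)
        passed = 0
        for _ in range(repeats):
            if not run_test(env):
                break
            passed += 1
        results.append((label, passed, repeats))
        if passed < repeats:
            break
    return results
```

Passing the runner in as a callable keeps the loop independent of any particular harness, so the same structure could wrap mochitest, reftest, or another suite.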
Comment 1 • 7 years ago (Reporter)
...but first, as a temporary measure, let's remove chaos mode from test verification so that we can start using test verification. I'll mark this bug leave-open for investigation. Hopefully we can restore this code soon.
Comment 2 • 7 years ago
Comment on attachment 8897860: remove chaos mode support from test verification
Review of attachment 8897860:
ok, it was a good idea.
Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/3539f73f0f04 Do not use chaos mode for test verification; r=jmaher
Comment 4 • 7 years ago | bugherder
https://hg.mozilla.org/mozilla-central/rev/3539f73f0f04
Comment 5 • 7 years ago (Reporter)
The main issue is seen here: https://public-artifacts.taskcluster.net/Si-ZnmjIRQCW5cNsu5lRDw/0/public/logs/live_backing.log
[task 2017-09-25T14:57:32.092Z] 14:57:32 INFO - TEST-INFO | started process GECKO(12284)
[task 2017-09-25T14:57:32.136Z] 14:57:32 INFO - GECKO(12284) | *** You are running in chaos test mode. See ChaosMode.h. ***
[task 2017-09-25T14:57:33.289Z] 14:57:33 INFO - GECKO(12284) | 1506351453285 Marionette INFO Enabled via --marionette
[task 2017-09-25T14:57:35.114Z] 14:57:35 INFO - GECKO(12284) | 1506351455109 Marionette INFO Listening on port 2828
[task 2017-09-25T14:57:35.373Z] 14:57:35 INFO - GECKO(12284) | 1506351455366 Marionette DEBUG Register listener.js for window 2147483649
[task 2017-09-25T14:57:35.750Z] 14:57:35 INFO - Traceback (most recent call last):
[task 2017-09-25T14:57:35.750Z] 14:57:35 INFO - File "/builds/worker/workspace/build/tests/mochitest/runtests.py", line 2660, in doTests
[task 2017-09-25T14:57:35.750Z] 14:57:35 INFO - marionette_args=marionette_args,
[task 2017-09-25T14:57:35.751Z] 14:57:35 INFO - File "/builds/worker/workspace/build/tests/mochitest/runtests.py", line 2164, in runApp
[task 2017-09-25T14:57:35.751Z] 14:57:35 INFO - addons.install(create_zip(self.mochijar))
[task 2017-09-25T14:57:35.752Z] 14:57:35 INFO - File "/builds/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/addons.py", line 52, in install
[task 2017-09-25T14:57:35.752Z] 14:57:35 INFO - raise AddonInstallException(e)
[task 2017-09-25T14:57:35.753Z] 14:57:35 INFO - AddonInstallException: Could not install add-on at '/tmp/tmpxnNgyy.zip': UnknownError: ERROR_FILE_ACCESS: There was an error accessing the filesystem.
[task 2017-09-25T14:57:35.753Z] 14:57:35 INFO - stacktrace:
[task 2017-09-25T14:57:35.755Z] 14:57:35 INFO - WebDriverError@chrome://marionette/content/error.js:239:5
[task 2017-09-25T14:57:35.756Z] 14:57:35 INFO - UnknownError@chrome://marionette/content/error.js:537:5
[task 2017-09-25T14:57:35.757Z] 14:57:35 INFO - addon.install@chrome://marionette/content/addon.js:101:11
[task 2017-09-25T14:57:35.758Z] 14:57:35 INFO - async*GeckoDriver.prototype.installAddon@chrome://marionette/content/driver.js:3326:10
[task 2017-09-25T14:57:35.759Z] 14:57:35 INFO - despatch@chrome://marionette/content/server.js:555:20
[task 2017-09-25T14:57:35.760Z] 14:57:35 INFO - async*execute@chrome://marionette/content/server.js:529:11
[task 2017-09-25T14:57:35.761Z] 14:57:35 INFO - async*onPacket/<@chrome://marionette/content/server.js:504:15
[task 2017-09-25T14:57:35.765Z] 14:57:35 INFO - async*onPacket@chrome://marionette/content/server.js:503:8
[task 2017-09-25T14:57:35.767Z] 14:57:35 INFO - _onJSONObjectReady/<@chrome://marionette/content/transport.js:501:9
[task 2017-09-25T14:57:35.767Z] 14:57:35 ERROR - Automation Error: Received unexpected exception while running application
[task 2017-09-25T14:57:35.771Z] 14:57:35 ERROR -
[task 2017-09-25T14:57:35.772Z] 14:57:35 INFO - Stopping web server
[task 2017-09-25T14:57:35.773Z] 14:57:35 INFO - GECKO(12284) | 1506351455742 addons.xpi WARN Failed to install /tmp/tmpxnNgyy.zip from file:///tmp/tmpxnNgyy.zip to /tmp/tmpfIuYKv.mozrunner/extensions/staged/mochikit@mozilla.org.xpi: Unix error 4 during operation pump (Interrupted system call) ((unknown module)) No traceback available
[task 2017-09-25T14:57:35.774Z] 14:57:35 INFO - Stopping web socket server
Comment 6 • 7 years ago (Reporter)
Test chaos mode has a variety of features -- see ChaosMode.h. It seems like test verification remains reliable if only some features are enabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2451d0577730036c61e1e70750546346da50a055&filter-tier=1&filter-tier=2&filter-tier=3
I'd like to land this, go to tier 2, then circle back here another day to figure out the issues and expand chaos mode support.
Comment 7 • 7 years ago
Comment on attachment 8912348: add back limited (3) chaos mode steps
Review of attachment 8912348:
very cool
Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/87291aa18bf0 Enable limited test chaos mode in test-verify; r=jmaher
Comment 9 • 7 years ago | bugherder
https://hg.mozilla.org/mozilla-central/rev/87291aa18bf0
Comment 10 • 6 years ago
The leave-open keyword is there and there is no activity for 6 months. :gbrown, maybe it's time to close this bug?
Comment 12 • 5 years ago
The leave-open keyword is there and there is no activity for 6 months.
:gbrown, maybe it's time to close this bug?
Comment 14 • 4 years ago (Reporter)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=679135ad6119f92398515e26f5b936fea93a1ab8
With ChaosFeature TimerScheduling, I see reftest crashes on shutdown:
[task 2020-02-19T23:23:07.007Z] 23:23:07 INFO - Assertion failure: rc != 0 (destroyed timer off its target thread!), at /builds/worker/workspace/build/src/xpcom/threads/TimerThread.cpp:443
[task 2020-02-19T23:23:22.728Z] 23:23:22 INFO - #01: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1220]
[task 2020-02-19T23:23:22.728Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.729Z] 23:23:22 INFO - #02: NS_ProcessNextEvent(nsIThread*, bool) [xpcom/threads/nsThreadUtils.cpp:481]
[task 2020-02-19T23:23:22.729Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.730Z] 23:23:22 INFO - #03: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:303]
[task 2020-02-19T23:23:22.730Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.731Z] 23:23:22 INFO - #04: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:315]
[task 2020-02-19T23:23:22.731Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.732Z] 23:23:22 INFO - #05: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:291]
[task 2020-02-19T23:23:22.732Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.733Z] 23:23:22 INFO - #06: nsThread::ThreadFunc(void*) [xpcom/threads/nsThread.cpp:466]
[task 2020-02-19T23:23:22.733Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.858Z] 23:23:22 INFO - #07: _pt_root [nsprpub/pr/src/pthreads/ptthread.c:204]
[task 2020-02-19T23:23:22.858Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.859Z] 23:23:22 INFO - #08: libpthread.so.0 + 0x76db
[task 2020-02-19T23:23:22.859Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.859Z] 23:23:22 INFO - #09: libc.so.6 + 0x12188f
[task 2020-02-19T23:23:22.859Z] 23:23:22 INFO -
[task 2020-02-19T23:23:22.860Z] 23:23:22 INFO - #10: ??? (???:???)
Comment 15 • 4 years ago (Reporter)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=83d04c3f961887d9bf79918a0c0b716656f95b46
With TimerScheduling disabled but all other modes enabled, there are no failures in this limited test.
Comment 16 • 4 years ago (Reporter)
Now, with all chaos modes enabled I see TimerThread assertions, especially in Windows mochitests:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0fc96086d619fbd100cd2ca4ecc08f2ee446555
With all but TimerScheduling enabled (0xfb), all is well:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0398f96ae62876627cf5eec4512b8c959725a0bd
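The 0xfb value can be reproduced by treating the chaos features as bit flags and clearing one bit. The sketch below is illustrative only: the feature names and bit values are assumptions recalled from ChaosMode.h and should be checked against the revision in use. The only values implied by this bug are that "all but TimerScheduling" is 0xfb, which puts TimerScheduling at bit 0x04 if the full set is 0xff. `mask_without` and `enabled_features` are hypothetical helpers.

```python
from enum import IntFlag


class ChaosFeature(IntFlag):
    # Assumed names and bit values; verify against ChaosMode.h before relying on them.
    THREAD_SCHEDULING = 0x01
    NETWORK_SCHEDULING = 0x02
    TIMER_SCHEDULING = 0x04
    IO_AMOUNTS = 0x08
    HASH_TABLE_ITERATION = 0x10
    IMAGE_CACHE = 0x20
    TASK_DISPATCHING = 0x40
    TASK_RUNNING = 0x80


# Assumption: eight single-bit features make up the full set.
ALL_FEATURES = 0xFF


def mask_without(*excluded: ChaosFeature) -> int:
    """Return the mask with every feature enabled except the excluded ones."""
    mask = ALL_FEATURES
    for feature in excluded:
        mask &= ~int(feature)
    return mask & ALL_FEATURES


def enabled_features(mask: int) -> list:
    """List the names of the features whose bits are set in mask."""
    return [feature.name for feature in ChaosFeature if mask & feature]


if __name__ == "__main__":
    mask = mask_without(ChaosFeature.TIMER_SCHEDULING)
    print(hex(mask))  # 0xfb
    print(enabled_features(mask))
```

Excluding a different feature in the same way gives the mask for a try push that tests whether that feature, rather than TimerScheduling, is the source of a failure.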
Comment 17 • 4 years ago (Reporter)
Comment 18 • 4 years ago
Pushed by whole.grains@protonmail.com: https://hg.mozilla.org/integration/autoland/rev/1c6aa40b84a2 Enable all test-verify chaos modes except TimerScheduling; r=jmaher
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/25368 for changes under testing/web-platform/tests
Comment 20 • 4 years ago | bugherder
Upstream PR merged by moz-wptsync-bot
Comment 22 • 4 years ago (Reporter)
Reviewing recent TV* runs on autoland, I think all is working as expected: Most TV runs pass; most TV failures occur in the first, non-chaos steps; TV failures during chaos mode steps seem reasonable.
Still leaving open for follow-up on TimerScheduling.