Frequent TEST-UNEXPECTED-TIMEOUT | toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js | Test timed out
Categories
(WebExtensions :: General, defect, P5)
Tracking
(firefox-esr91 unaffected, firefox-esr102 unaffected, firefox101 unaffected, firefox102 unaffected, firefox103 fixed)
Tracking | Status | |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox-esr102 | --- | unaffected |
firefox101 | --- | unaffected |
firefox102 | --- | unaffected |
firefox103 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: rpl)
References
(Regression)
Details
(Keywords: intermittent-failure, regression, Whiteboard: [retriggered][stockwell needswork:owner])
Attachments
(2 files)
Filed by: ncsoregi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=381918471&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/a-NJtAw3Tju0oLo08OA9gw/runs/0/artifacts/public/logs/live_backing.log
[task 2022-06-20T15:43:54.667Z] 15:43:54 INFO - TEST-START | xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js
[task 2022-06-20T15:48:54.725Z] 15:48:54 WARNING - TEST-UNEXPECTED-TIMEOUT | xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js | Test timed out
[task 2022-06-20T15:48:54.725Z] 15:48:54 INFO - TEST-INFO took 300057ms
[task 2022-06-20T15:48:54.725Z] 15:48:54 INFO - >>>>>>>
[task 2022-06-20T15:48:54.725Z] 15:48:54 INFO - (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2022-06-20T15:48:54.726Z] 15:48:54 INFO - (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2022-06-20T15:48:54.726Z] 15:48:54 INFO - (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2022-06-20T15:48:54.726Z] 15:48:54 INFO - running event loop
[task 2022-06-20T15:48:54.727Z] 15:48:54 INFO - xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js | Starting setup check_remote
[task 2022-06-20T15:48:54.727Z] 15:48:54 INFO - (xpcshell/head.js) | test check_remote pending (2)
[task 2022-06-20T15:48:54.728Z] 15:48:54 INFO - TEST-PASS | xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js | check_remote - [check_remote : 1] useRemoteWebExtensions matches - true == true
Comment 2•3 years ago
|
||
This is not caused by bug 1773182. That bug added some new C++ helper functions which would not result in these failures.
Comment hidden (Intermittent Failures Robot) |
Comment 4•3 years ago
|
||
Hi Shane, could you please assign someone to take a look at this?
There are 34 total failures in the last 2 days on
- windows7-32-qr opt
- windows10-64-2004-qr debug and opt
- windows10-64-2004-ccov-qr opt
- windows10-32-2004-qr debug
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=382110705&repo=mozilla-central&lineNumber=1896
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 7•3 years ago
|
||
I've started to look into this today, I noticed that:
-
I was not able to see it failing intermittently if the test was restricted to only the "in-process-webextensions" mode (
mach try --tag in-process-webextensions --verify path/to/the/testfile
) -
but it does fail pretty often (on the windows box where I was testing it it did fail after 3 runs of the --verify execution at most) with the "remote-webextensions" mode (
mach try --tag remote-webextensions --verify path/to/the/testfile
)
I was tweaking the test a bit for making it fail fast when the failure is being triggered, I have been also able to trigger it on linux both optimized and debug build, and I did verify that:
- when the failure is triggered it is because the inline script is intermittently failing to pass the csp hash check and it is blocked because identified as a csp policy violation (in my tweaked version I added a listener to the "securitypolicyviolation" event and that listener was logging a violation for the inline script consistently)
When the failure is triggered in a debug build I also noticed that the following assertion error is being logged:
[Child 2367268, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, false) failed with result 0x80070057 (NS_ERROR_ILLEGAL_VALUE): file /builds/worker/checkouts/gecko/dom/security/nsCSPUtils.cpp:991
right before the csp violation, and that points to the assertion here in nsCSPHashSrc::allows.
This seems to be pretty consistent too and so it seems worth digging into it a bit more.
Assignee | ||
Comment 8•3 years ago
|
||
Hi Barret,
do you have some idea about what may be triggering that NS_ERROR_ILLEGAL_VALUE intermittently from that NS_NewCryptoHash call? (see comment 7)
Now that I've been able to trigger it on Linux I may also be able to record the failure with rr and dig into it a bit more, but in the meantime I wanted to ask you if you have some ideas (even just of some things that I could check in an rr recording to pin-point what is triggering this intermittently).
Comment 9•3 years ago
|
||
I was puzzled by this at first but it turns out NS_ERROR_ILLEGAL_VALUE == NS_ERROR_INVALID_ARG
NS_NewCryptoHash(const nsCString&, nsICryptoHash**) returns NS_ERROR_INVALID_ARG if either (a) an invalid algorithm is provided to nsCryptoHash::InitWithString or (b) the call to HASH_Create in nsCryptoHash::Init() returns nullptr. The latter case is due to NSS failures, I'm not an expert there.
Comment 10•3 years ago
|
||
Set release status flags based on info from the regressing bug 1773797
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 12•3 years ago
|
||
This patch doesn't fix the intermittent yet (the other patch attached applied the changes needed to prevent
nsCSPHashSrc::allows from failing intermittently when NSS was not initialized by the time we hit the CSP hash src
checks) but it does tweak the test a bit to:
-
make it easier to change the test inline script without having to manually recomputing its sha256
(which will make it easier to tweak the line script when needed, even just temporary changes applied
for investigating an unexpected failure like for this bug). -
make sure the test fails fast if the inline script gets blocked by an unexpected csp violation (like the one
tracked by this issue), by triggering an explicit failure and end the test immediately when the
"securitypolicyviolation" event listener is hit.
Updated•3 years ago
|
Assignee | ||
Comment 13•3 years ago
|
||
The intermittent failure tracked by Bug 1775119 is being triggered when nsCSPHashSrc::allows is called while NSS isn't
initialized yet (which is what makes NS_NewCryptoHash to return a NS_ERROR_ILLEGAL_VALUE nsresult).
After recording the test passing successfully with rr and looked into what did actually initialize NSS when the test pass,
it seems that the NSS initialization (at least in the xpcshell test environment) happened just as a side effect of a call
to Performance::Now()
triggered by a call to Document::HasRecentlyStartedForegroundLoads()
.
This call is originated by the TaskController and so it may not be always triggered before the call to ns CSPHashSrc::allows
and when that happens the inline script triggers an unexpected CSP violation and it is not executed.
This also explains why the test was never failing in the in-process-webextensions, when the extensions are running on the
parent browser process NSS is likely already initialized by some other Gecko internals.
This patch adds an explicit call to EnsureNSSInitializedChromeOrContent right before the call to NS_NewCryptoHash, which is
making sure NSS is always initialized before trying to create a new crypto hash instance (which then makes the test to
pass consistently, included when executed with --verify).
Depends on D150241
Comment 14•3 years ago
|
||
Comment 15•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/030a01092ee3
https://hg.mozilla.org/mozilla-central/rev/b14241477254
Updated•3 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment 17•3 years ago
|
||
Set release status flags based on info from the regressing bug 1773797
Description
•