Closed Bug 1775119 Opened 3 years ago Closed 3 years ago

Frequent TEST-UNEXPECTED-TIMEOUT | toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js | Test timed out

Tracking

(firefox-esr91 unaffected, firefox-esr102 unaffected, firefox101 unaffected, firefox102 unaffected, firefox103 fixed)

Status:

RESOLVED FIXED

Milestone:

103 Branch

Tracking Flags:

Tracking

Status

firefox-esr91

---

unaffected

firefox-esr102

---

unaffected

firefox101

---

unaffected

firefox102

---

unaffected

firefox103

---

fixed

People

(Reporter: intermittent-bug-filer, Assigned: rpl)

References

(Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [retriggered][stockwell needswork:owner])

Attachments

(2 files)

Bug 1775119 - Tweak test_background_page_injects_scripts_inline to fail fast on unexpected csp violation. r?willdurand! 3 years ago Luca Greco [:rpl] [:luca] [:lgreco] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1775119 - Make sure NSS is initialized before calling NS_NewCryptoHash from nsCSPHashSrc::allows. r?barret,ckerschb! 3 years ago Luca Greco [:rpl] [:luca] [:lgreco] 48 bytes, text/x-phabricator-request		Details \| Review

Treeherder Bug Filer

Reporter

Description

•

3 years ago

treeherder

Filed by: ncsoregi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=381918471&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/a-NJtAw3Tju0oLo08OA9gw/runs/0/artifacts/public/logs/live_backing.log

[task 2022-06-20T15:43:54.667Z] 15:43:54     INFO -  TEST-START | xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js
[task 2022-06-20T15:48:54.725Z] 15:48:54  WARNING -  TEST-UNEXPECTED-TIMEOUT | xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js | Test timed out
[task 2022-06-20T15:48:54.725Z] 15:48:54     INFO -  TEST-INFO took 300057ms
[task 2022-06-20T15:48:54.725Z] 15:48:54     INFO -  >>>>>>>
[task 2022-06-20T15:48:54.725Z] 15:48:54     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2022-06-20T15:48:54.726Z] 15:48:54     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2022-06-20T15:48:54.726Z] 15:48:54     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2022-06-20T15:48:54.726Z] 15:48:54     INFO -  running event loop
[task 2022-06-20T15:48:54.727Z] 15:48:54     INFO -  xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js | Starting setup check_remote
[task 2022-06-20T15:48:54.727Z] 15:48:54     INFO -  (xpcshell/head.js) | test check_remote pending (2)
[task 2022-06-20T15:48:54.728Z] 15:48:54     INFO -  TEST-PASS | xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_script_filenames.js | check_remote - [check_remote : 1] useRemoteWebExtensions matches - true == true

Cosmin Sabou [:CosminS]

Comment 1

•

3 years ago

Retriggers here.

Flags: needinfo?(brennie)

Regressed by: 1773182

Beth Rennie [:beth] (she/her)

Comment 2

•

3 years ago

This is not caused by bug 1773182. That bug added some new C++ helper functions which would not result in these failures.

Flags: needinfo?(brennie)

Beth Rennie [:beth] (she/her)

Updated

•

3 years ago

No longer regressed by: 1773182

Comment hidden (Intermittent Failures Robot)

Andreea Pavel [:apavel]

Comment 4

•

3 years ago

Hi Shane, could you please assign someone to take a look at this?

There are 34 total failures in the last 2 days on

windows7-32-qr opt
windows10-64-2004-qr debug and opt
windows10-64-2004-ccov-qr opt
windows10-32-2004-qr debug
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=382110705&repo=mozilla-central&lineNumber=1896

Flags: needinfo?(mixedpuppy)

Whiteboard: [retriggered][stockwell needswork:owner]

Comment hidden (Intermittent Failures Robot)

Luca Greco [:rpl] [:luca] [:lgreco]

Assignee

Comment 7

•

3 years ago

I've started to look into this today, I noticed that:

I was not able to see it failing intermittently if the test was restricted to only the "in-process-webextensions" mode (mach try --tag in-process-webextensions --verify path/to/the/testfile)
but it does fail pretty often (on the windows box where I was testing it it did fail after 3 runs of the --verify execution at most) with the "remote-webextensions" mode (mach try --tag remote-webextensions --verify path/to/the/testfile)

I was tweaking the test a bit for making it fail fast when the failure is being triggered, I have been also able to trigger it on linux both optimized and debug build, and I did verify that:

when the failure is triggered it is because the inline script is intermittently failing to pass the csp hash check and it is blocked because identified as a csp policy violation (in my tweaked version I added a listener to the "securitypolicyviolation" event and that listener was logging a violation for the inline script consistently)

When the failure is triggered in a debug build I also noticed that the following assertion error is being logged:

[Child 2367268, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, false) failed with result 0x80070057 (NS_ERROR_ILLEGAL_VALUE): file /builds/worker/checkouts/gecko/dom/security/nsCSPUtils.cpp:991

right before the csp violation, and that points to the assertion here in nsCSPHashSrc::allows.

This seems to be pretty consistent too and so it seems worth digging into it a bit more.

Flags: needinfo?(mixedpuppy)

Luca Greco [:rpl] [:luca] [:lgreco]

Assignee

Comment 8

•

3 years ago

Hi Barret,
do you have some idea about what may be triggering that NS_ERROR_ILLEGAL_VALUE intermittently from that NS_NewCryptoHash call? (see comment 7)

Now that I've been able to trigger it on Linux I may also be able to record the failure with rr and dig into it a bit more, but in the meantime I wanted to ask you if you have some ideas (even just of some things that I could check in an rr recording to pin-point what is triggering this intermittently).

Flags: needinfo?(brennie)

Beth Rennie [:beth] (she/her)

Comment 9

•

3 years ago

I was puzzled by this at first but it turns out NS_ERROR_ILLEGAL_VALUE == NS_ERROR_INVALID_ARG

NS_NewCryptoHash(const nsCString&, nsICryptoHash**) returns NS_ERROR_INVALID_ARG if either (a) an invalid algorithm is provided to nsCryptoHash::InitWithString or (b) the call to HASH_Create in nsCryptoHash::Init() returns nullptr. The latter case is due to NSS failures, I'm not an expert there.

Flags: needinfo?(brennie)

Beth Rennie [:beth] (she/her)

Updated

•

3 years ago

Regressed by: 1773797

BugBot [:suhaib / :marco/ :calixte]

Comment 10

•

3 years ago

Set release status flags based on info from the regressing bug 1773797

status-firefox101: --- → unaffected

status-firefox102: --- → unaffected

status-firefox103: --- → affected

status-firefox-esr91: --- → unaffected

Comment hidden (Intermittent Failures Robot)

Luca Greco [:rpl] [:luca] [:lgreco]

Assignee

Comment 12

•

3 years ago

Attached file Bug 1775119 - Tweak test_background_page_injects_scripts_inline to fail fast on unexpected csp violation. r?willdurand! — Details

This patch doesn't fix the intermittent yet (the other patch attached applied the changes needed to prevent
nsCSPHashSrc::allows from failing intermittently when NSS was not initialized by the time we hit the CSP hash src
checks) but it does tweak the test a bit to:

make it easier to change the test inline script without having to manually recomputing its sha256
(which will make it easier to tweak the line script when needed, even just temporary changes applied
for investigating an unexpected failure like for this bug).
make sure the test fails fast if the inline script gets blocked by an unexpected csp violation (like the one
tracked by this issue), by triggering an explicit failure and end the test immediately when the
"securitypolicyviolation" event listener is hit.

Phabricator Automation

Updated

•

3 years ago

Assignee: nobody → lgreco

Status: NEW → ASSIGNED

Luca Greco [:rpl] [:luca] [:lgreco]

Assignee

Comment 13

•

3 years ago

Attached file Bug 1775119 - Make sure NSS is initialized before calling NS_NewCryptoHash from nsCSPHashSrc::allows. r?barret,ckerschb! — Details

The intermittent failure tracked by Bug 1775119 is being triggered when nsCSPHashSrc::allows is called while NSS isn't
initialized yet (which is what makes NS_NewCryptoHash to return a NS_ERROR_ILLEGAL_VALUE nsresult).

After recording the test passing successfully with rr and looked into what did actually initialize NSS when the test pass,
it seems that the NSS initialization (at least in the xpcshell test environment) happened just as a side effect of a call
to Performance::Now() triggered by a call to Document::HasRecentlyStartedForegroundLoads().
This call is originated by the TaskController and so it may not be always triggered before the call to ns CSPHashSrc::allows
and when that happens the inline script triggers an unexpected CSP violation and it is not executed.

This also explains why the test was never failing in the in-process-webextensions, when the extensions are running on the
parent browser process NSS is likely already initialized by some other Gecko internals.

This patch adds an explicit call to EnsureNSSInitializedChromeOrContent right before the call to NS_NewCryptoHash, which is
making sure NSS is always initialized before trying to create a new crypto hash instance (which then makes the test to
pass consistently, included when executed with --verify).

Depends on D150241

Beth Rennie [:beth] (she/her)

Updated

•

3 years ago

Blocks: 1776442

Pulsebot

Comment 14

•

3 years ago

Pushed by luca.greco@alcacoop.it: https://hg.mozilla.org/integration/autoland/rev/030a01092ee3 Tweak test_background_page_injects_scripts_inline to fail fast on unexpected csp violation. r=willdurand https://hg.mozilla.org/integration/autoland/rev/b14241477254 Make sure NSS is initialized before calling NS_NewCryptoHash from nsCSPHashSrc::allows. r=ckerschb

Natalia Csoregi [:nataliaCs]

Comment 15

•

3 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/030a01092ee3
https://hg.mozilla.org/mozilla-central/rev/b14241477254

Status: ASSIGNED → RESOLVED

Closed: 3 years ago

status-firefox103: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → 103 Branch

BugBot [:suhaib / :marco/ :calixte]

Updated

•

3 years ago

Keywords: regression

Comment hidden (Intermittent Failures Robot)

BugBot [:suhaib / :marco/ :calixte]

Comment 17

•

3 years ago

Set release status flags based on info from the regressing bug 1773797

status-firefox-esr102: --- → unaffected

You need to log in before you can comment on or make changes to this bug.