test_ThirdPartyModulesPing.js perma-orange on (Windows) artifact builds
Categories
(Toolkit :: Telemetry, defect, P1)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr128 | --- | unaffected |
| firefox-esr140 | --- | unaffected |
| firefox140 | --- | unaffected |
| firefox141 | --- | fixed |
| firefox142 | --- | fixed |
People
(Reporter: Gijs, Assigned: chutten)
References
(Regression)
Details
(Keywords: regression)
Attachments
(2 files)
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
Probably a regression from bug 1963853 because I didn't see this a week or two ago and that bug is a pretty recent change to the test and (by the sounds of the commit message) the implementation.
:chutten... I hate to ask, but, any idea what's up here?
| Reporter | ||
Comment 1•10 months ago
|
||
(The test only runs on Windows, btw, so it's orange everywhere it runs...)
Failure message isn't terribly helpful:
TEST-UNEXPECTED-FAIL | toolkit/components/telemetry/tests/unit/test_ThirdPartyModulesPing.js | xpcshell return code: 0
structure log link to the later-dumped full log which is maybe more helpful? https://treeherder.mozilla.org/logviewer?job_id=514779775&repo=try&lineNumber=4205
Comment 2•10 months ago
|
||
Set release status flags based on info from the regressing bug 1963853
| Assignee | ||
Comment 3•10 months ago
|
||
Well that's not good:
[task 2025-06-25T15:39:02.539+00:00] 15:39:02 INFO - Unexpected exception Waiting for Glean ping - timed out after 50 tries.
Lemme give this a go locally and see what I can turn up.
| Assignee | ||
Comment 4•10 months ago
|
||
Oh. Oh no. I've committed one of the cardinal sins of JOG: testing using JS something instrumented in C++.
Potentially-Unnecessary Details
C++ and JS use the same infrastructure for testBeforeNextSubmit and testSubmission -- when a ping is about to be submitted in either language, a central registry is queried to see if there's any test callbacks to run first. This registry is keyed by the ping's id. This is how this test works in full builds despite the callback being registered in JS and the ping being submitted in C++.
When JOG registers new pings to ensure JS has up-to-date definitions to operate on, it supplies those new pings with new, non-overlapping ids. (This is to avoid the same id referencing two different ping instances.) This means that the C++ code is submitting "third-party-modules" with id, I dunno, let's say 35, but our JS test registered its callback on GleanPings.thirdPartyModules only after JOG logs: 0:08.76 pid:34720 2025-06-26 20:41:04.180000 UTC - [Parent 34720: Main Thread]: V/jog Registering ping third-party-modules id 32803.
The ids no longer match, so no registered callback can be triggered.
The solution I've chosen is to skip this test on artifact builds and document in the JOG docs this particular failure case. I've filed bug 1974314 to consider a design for testBeforeNextSubmit/testSubmission that will allow us to unskip it in the future.
| Assignee | ||
Comment 5•10 months ago
|
||
| Assignee | ||
Comment 6•10 months ago
|
||
Comment 9•10 months ago
|
||
The patch landed in nightly and beta is affected.
:chutten, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- See https://wiki.mozilla.org/Release_Management/Requesting_an_Uplift for documentation on how to request an uplift.
- If no, please set
status-firefox141towontfix.
For more information, please visit BugBot documentation.
| Assignee | ||
Comment 11•10 months ago
|
||
Original Revision: https://phabricator.services.mozilla.com/D255309
Updated•10 months ago
|
Comment 12•10 months ago
|
||
firefox-beta Uplift Approval Request
- User impact if declined: None
- Code covered by automated testing: yes
- Fix verified in Nightly: yes
- Needs manual QE test: no
- Steps to reproduce for manual QE testing: n/a
- Risk associated with taking this patch: low/none
- Explanation of risk level: test-only change
- String changes made/needed: none
- Is Android affected?: no
Comment 13•10 months ago
|
||
Chris, could you rebase your patch to the beta branch? It doesn't apply cleanly to the branch. Thanks!
| Assignee | ||
Comment 14•10 months ago
|
||
Sure thing, I've updated https://phabricator.services.mozilla.com/D255505 to be based on origin/beta
Updated•10 months ago
|
Updated•10 months ago
|
Comment 15•10 months ago
|
||
| uplift | ||
| Reporter | ||
Comment 16•9 months ago
|
||
I'm so sorry but... I was on PTO when this landed. Just now I did another artifact trypush off this m-c rev from this morning, and push health shows 6 failures (at time of writing) in this test. The failures look like:
TEST-UNEXPECTED-FAIL | toolkit/components/telemetry/tests/unit/test_ThirdPartyModulesPing.js | xpcshell return code: 0
TEST-UNEXPECTED-FAIL | toolkit/components/telemetry/tests/unit/test_ThirdPartyModulesPing.js | test_new_old_instances - [test_new_old_instances : 300] Missing expected exception New instances should not exist!
That's from this test assertion:
await Assert.rejects(
Telemetry.getUntrustedModuleLoadEvents(),
e => e.result == Cr.NS_ERROR_NOT_AVAILABLE,
"New instances should not exist!"
);
The patch skipped the first task, but not the others. Is this correct or should the whole test be skipped in the manifest?
| Assignee | ||
Comment 17•9 months ago
|
||
That test doesn't exercise Glean code and thus should be as valid now as it ever was (aSubmitGleanPing is false and aFlags is non-0)...unless it relied on test_send_ping running first in order to work?
On a cost-benefit, I expect investigating this to cost more than the benefit of running down exactly what's happening in this one test, so I'd favourably regard a patch that disables the file on artifact.
Description
•