Intermittent Windows Marionette Return code: 10
Categories
(Testing :: Marionette Client and Harness, defect, P5)
Tracking
(firefox132 fixed)
Tracking | Status | |
---|---|---|
firefox132 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: jdescottes)
References
Details
(Keywords: intermittent-failure, Whiteboard: [webdriver:m12][retriggered])
Attachments
(1 file)
Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=394617514&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/KBV8B0OfTMmWo3ZbCco2Sg/runs/0/artifacts/public/logs/live_backing.log
[task 2022-10-27T23:09:43.867Z] 23:09:43 INFO - DEBUG: Starting phase profile-before-change
[task 2022-10-27T23:09:43.873Z] 23:09:43 INFO - DEBUG: Spinning the event loop
[task 2022-10-27T23:09:43.874Z] 23:09:43 INFO - DEBUG: Completed blocker EnvironmentAddonBuilder for phase AddonManager: Waiting to start provider shutdown.
[task 2022-10-27T23:09:43.874Z] 23:09:43 INFO - DEBUG: Completed blocker PageActions: purging unregistered actions from cache for phase profile-before-change
[task 2022-10-27T23:09:43.874Z] 23:09:43 INFO - DEBUG: Completed blocker Places Expiration: shutdown for phase Places Connection shutdown
[task 2022-10-27T23:09:43.875Z] 23:09:43 INFO - DEBUG: Completed blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2022-10-27T23:09:43.876Z] 23:09:43 INFO - DEBUG: Completed blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2022-10-27T23:09:43.876Z] 23:09:43 INFO - DEBUG: Completed blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2022-10-27T23:09:43.876Z] 23:09:43 INFO - DEBUG: Completed blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2022-10-27T23:09:43.877Z] 23:09:43 INFO - DEBUG: Completed blocker RemoteSettingsClient - finish IDB access. for phase profile-before-change
[task 2022-10-27T23:09:43.877Z] 23:09:43 INFO - DEBUG: Completed blocker Remote Settings profile-before-change for phase profile-before-change
[task 2022-10-27T23:09:43.877Z] 23:09:43 INFO - DEBUG: Completed blocker DownloadAutoSaveView: writing data for phase profile-before-change
[task 2022-10-27T23:09:43.878Z] 23:09:43 INFO - DEBUG: Completed blocker Search service: shutting down for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2022-10-27T23:09:43.878Z] 23:09:43 INFO - DEBUG: Completed blocker places.sqlite#1: waiting for shutdown for phase Sqlite.sys.mjs: wait until all connections are closed
[task 2022-10-27T23:09:43.878Z] 23:09:43 INFO - DEBUG: Completed blocker OS.File: flush I/O queued before profileBeforeChange for phase profile-before-change
[task 2022-10-27T23:09:43.906Z] 23:09:43 INFO - DEBUG: Completed blocker PlacesUtils wrapped connection closing as part of Places shutdown for phase Places Connection shutdown
[task 2022-10-27T23:09:43.914Z] 23:09:43 INFO - DEBUG: Completed blocker PlacesUtils wrapped connection must be closed before Sqlite.sys.mjs for phase Sqlite.sys.mjs: wait until all clients have completed their task
[task 2022-10-27T23:09:43.915Z] 23:09:43 INFO - DEBUG: Completed blocker SitePermsAddonProvider for phase AddonManager: Waiting for providers to shut down.
[task 2022-10-27T23:09:43.915Z] 23:09:43 INFO - DEBUG: Completed blocker GMPProvider for phase AddonManager: Waiting for providers to shut down.
[task 2022-10-27T23:09:43.915Z] 23:09:43 INFO - DEBUG: Completed blocker XPIProvider for phase AddonManager: Waiting for providers to shut down.
[task 2022-10-27T23:09:43.915Z] 23:09:43 INFO - DEBUG: Completed blocker AddonManager: shutting down. for phase profile-before-change
[task 2022-10-27T23:09:43.916Z] 23:09:43 INFO - DEBUG: Completed blocker CrashMonitor: Writing notifications to file after receiving profile-before-change and awaiting all checkpoints written for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2022-10-27T23:09:43.922Z] 23:09:43 INFO - DEBUG: Completed blocker content-prefs.sqlite#0: waiting for shutdown for phase Sqlite.sys.mjs: wait until all connections are closed
[task 2022-10-27T23:09:43.924Z] 23:09:43 INFO - DEBUG: Completed blocker Closing ContentPrefService2 connection. for phase Sqlite.sys.mjs: wait until all clients have completed their task
[task 2022-10-27T23:09:43.924Z] 23:09:43 INFO - DEBUG: Completed blocker places.sqlite#0: waiting for shutdown for phase Sqlite.sys.mjs: wait until all connections are closed
[task 2022-10-27T23:09:43.924Z] 23:09:43 INFO - DEBUG: Completed blocker PlacesUtils read-only connection closing as part of Places shutdown for phase Places Connection shutdown
[task 2022-10-27T23:09:43.924Z] 23:09:43 INFO - DEBUG: Completed blocker PlacesUtils read-only connection must be closed before Sqlite.sys.mjs for phase Sqlite.sys.mjs: wait until all clients have completed their task
[task 2022-10-27T23:09:43.927Z] 23:09:43 INFO - DEBUG: Completed blocker Sqlite.sys.mjs shutdown blocker for phase profile-before-change
[task 2022-10-27T23:09:43.937Z] 23:09:43 INFO - DEBUG: Finished phase profile-before-change
[task 2022-10-27T23:09:43.958Z] 23:09:43 INFO - DEBUG: Starting phase profile-before-change-telemetry
[task 2022-10-27T23:09:43.964Z] 23:09:43 INFO - DEBUG: Spinning the event loop
[task 2022-10-27T23:09:44.247Z] 23:09:44 INFO - DEBUG: Adding blocker Waiting for ping task for phase TelemetryController: Waiting for pending ping activity
[task 2022-10-27T23:09:44.300Z] 23:09:44 INFO - DEBUG: Adding blocker Waiting for ping task for phase TelemetryController: Waiting for pending ping activity
[task 2022-10-27T23:09:44.322Z] 23:09:44 INFO - DEBUG: Completed blocker Waiting for ping task for phase TelemetryController: Waiting for pending ping activity
[task 2022-10-27T23:09:44.328Z] 23:09:44 INFO - DEBUG: Completed blocker Waiting for ping task for phase TelemetryController: Waiting for pending ping activity
[task 2022-10-27T23:09:44.340Z] 23:09:44 INFO - DEBUG: Completed blocker TelemetryController: shutting down for phase profile-before-change-telemetry
[task 2022-10-27T23:09:44.341Z] 23:09:44 INFO - DEBUG: Finished phase profile-before-change-telemetry
[task 2022-10-27T23:09:44.342Z] 23:09:44 INFO - DEBUG: Starting phase xpcom-will-shutdown
[task 2022-10-27T23:09:44.342Z] 23:09:44 INFO - DEBUG: Spinning the event loop
[task 2022-10-27T23:09:44.346Z] 23:09:44 INFO - DEBUG: Completed blocker OS.File: flush pending requests, warn about unclosed files, shut down service. for phase xpcom-will-shutdown
[task 2022-10-27T23:09:44.346Z] 23:09:44 INFO - DEBUG: Finished phase xpcom-will-shutdown
[task 2022-10-27T23:09:44.349Z] 23:09:44 INFO - DEBUG: Starting phase web-workers-shutdown
[task 2022-10-27T23:09:44.349Z] 23:09:44 INFO - DEBUG: Spinning the event loop
[task 2022-10-27T23:09:44.350Z] 23:09:44 INFO - DEBUG: Finished phase web-workers-shutdown
[task 2022-10-27T23:09:46.561Z] 23:09:46 ERROR - Return code: 10
[task 2022-10-27T23:09:46.561Z] 23:09:46 INFO - TinderboxPrint: marionette<br/>1013/<em class="testfail">1</em>/36
[task 2022-10-27T23:09:46.561Z] 23:09:46 INFO - gecko.log not found
[task 2022-10-27T23:09:46.561Z] 23:09:46 INFO - TinderboxPrint: marionette<br/>1013/<em class="testfail">1</em>/36
[task 2022-10-27T23:09:46.561Z] 23:09:46 INFO - Marionette exited with return code 10: WARNING
[task 2022-10-27T23:09:46.561Z] 23:09:46 WARNING - # TBPL WARNING #
[task 2022-10-27T23:09:46.561Z] 23:09:46 WARNING - setting return code to 1
[task 2022-10-27T23:09:46.561Z] 23:09:46 INFO - Running post-action listener: _package_coverage_data
[task 2022-10-27T23:09:46.561Z] 23:09:46 INFO - Running post-action listener: _resource_record_post_action
[task 2022-10-27T23:09:46.562Z] 23:09:46 INFO - Running post-action listener: process_java_coverage_data
[task 2022-10-27T23:09:46.562Z] 23:09:46 INFO - [mozharness: 2022-10-27 23:09:46.562432Z] Finished run-tests step (success)
[task 2022-10-27T23:09:46.562Z] 23:09:46 INFO - Running post-run listener: _resource_record_post_run
[task 2022-10-27T23:09:46.762Z] 23:09:46 INFO - Total resource usage - Wall time: 2775s; CPU: 42%; Read bytes: 306483200; Write bytes: 6099653632; Read time: 0; Write time: 149
[task 2022-10-27T23:09:46.762Z] 23:09:46 INFO - TinderboxPrint: CPU usage<br/>41.7%
[task 2022-10-27T23:09:46.762Z] 23:09:46 INFO - TinderboxPrint: I/O read bytes / time<br/>306,483,200 / 0
[task 2022-10-27T23:09:46.762Z] 23:09:46 INFO - TinderboxPrint: I/O write bytes / time<br/>6,099,653,632 / 149
[task 2022-10-27T23:09:46.762Z] 23:09:46 INFO - TinderboxPrint: CPU idle<br/>12,960.9 (58.3%)
[task 2022-10-27T23:09:46.762Z] 23:09:46 INFO - TinderboxPrint: CPU system<br/>2,466.9 (11.1%)
[task 2022-10-27T23:09:46.762Z] 23:09:46 INFO - TinderboxPrint: CPU user<br/>6,772.3 (30.5%)
[task 2022-10-27T23:09:46.775Z] 23:09:46 INFO - install - Wall time: 10s; CPU: 25%; Read bytes: 73728; Write bytes: 394252288; Read time: 0; Write time: 25
[task 2022-10-27T23:09:46.787Z] 23:09:46 INFO - run-tests - Wall time: 2766s; CPU: 42%; Read bytes: 305488896; Write bytes: 5612545024; Read time: 0; Write time: 122
[task 2022-10-27T23:09:47.431Z] 23:09:47 WARNING - returning nonzero exit status 1
[taskcluster 2022-10-27T23:09:47.514Z] Exit Code: 1
[taskcluster 2022-10-27T23:09:47.514Z] User Time: 31.25ms
[taskcluster 2022-10-27T23:09:47.514Z] Kernel Time: 15.625ms
[taskcluster 2022-10-27T23:09:47.514Z] Wall Time: 47m43.1632606s
[taskcluster 2022-10-27T23:09:47.514Z] Result: FAILED
[taskcluster 2022-10-27T23:09:47.514Z] === Task Finished ===
[taskcluster 2022-10-27T23:09:47.514Z] Task Duration: 47m43.1666893s
[taskcluster 2022-10-27T23:09:47.611Z] Uploading artifact public/logs/localconfig.json from file logs\localconfig.json with content encoding "gzip", mime type "application/json" and expiry 2023-01-25T22:02:42.793Z
[taskcluster 2022-10-27T23:09:47.772Z] Uploading artifact public/test_info/marionette_errorsummary.log from file build\blobber_upload_dir\marionette_errorsummary.log with content encoding "gzip", mime type "text/plain" and expiry 2023-01-25T22:02:42.793Z
[taskcluster 2022-10-27T23:09:48.602Z] Uploading artifact public/test_info/marionette_raw.log from file build\blobber_upload_dir\marionette_raw.log with content encoding "gzip", mime type "text/plain" and expiry 2023-01-25T22:02:42.793Z
[taskcluster 2022-10-27T23:09:48.888Z] Uploading artifact public/test_info/report.html from file build\blobber_upload_dir\report.html with content encoding "gzip", mime type "text/html; charset=utf-8" and expiry 2023-01-25T22:02:42.793Z
[taskcluster 2022-10-27T23:09:48.995Z] Uploading artifact public/test_info/resource-usage.json from file build\blobber_upload_dir\resource-usage.json with content encoding "gzip", mime type "application/json" and expiry 2023-01-25T22:02:42.793Z
[taskcluster 2022-10-27T23:09:49.481Z] Uploading link artifact public/logs/live.log to artifact public/logs/live_backing.log with expiry 2023-01-25T22:02:42.793Z
[taskcluster:error] exit status 1
Comment 1•2 years ago
|
||
The Marionette harness will exit with an code of 10
when tests were failing. Interestingly I see:
[task 2022-10-27T23:09:42.639Z] 23:09:42 INFO - passed: 1013
[task 2022-10-27T23:09:42.639Z] 23:09:42 INFO - failed: 1
[task 2022-10-27T23:09:42.639Z] 23:09:42 INFO - todo: 36 (skipped: 30)
[task 2022-10-27T23:09:42.640Z] 23:09:42 INFO - FAILED TESTS
But there is no failure listed in the parsed log. No idea what's broken here given that we haven't changed anything in that code for quite some time. Maybe there was a Python upgrade recently for test workers which might have caused that behavior?
Comment 2•2 years ago
|
||
the base win10 image hasn't changed. As this is intermittent and on ASAN only, this makes me wonder if we have some timing issue. For successful runs I see test summary of 1012/0/36 (pass/fail/skip-todo), but for the failures I see (1013/1/36). I don't know if that means anything.
Quite likely the browser could have changed some timing in shutdown.
This job (win10/asan-marionette) fails 12.5% of the time in the last month, it is broken down in these bugs:
bug id | frequency |
---|---|
1782045 | 10 |
1791299 | 9 |
1800035 | 6 |
1797552 | 2 |
1792855 | 2 |
1691876 | 2 |
1790942 | 2 |
1795485 | 1 |
1798158 | 1 |
1413849 | 1 |
1780602 | 1 |
1669403 | 1 |
1727152 | 1 |
while this bug isn't the most frequent, it might be the most confusing.
Comment hidden (Intermittent Failures Robot) |
Comment 4•2 years ago
|
||
(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #2)
successful runs I see test summary of 1012/0/36 (pass/fail/skip-todo), but for the failures I see (1013/1/36). I don't know if that means anything.
That's interesting. Why do we have one time 1012 tests and another time 1014 tests that have been running? It might be good to check and compare different logs from passing and failing jobs to see which tests are run on top in failing cases. But I would defer that until the failure rate is high enough.
Comment 5•2 years ago
|
||
Comment hidden (Intermittent Failures Robot) |
Comment 7•2 years ago
|
||
I cannot see backfill jobs but only Mn
ones. Given that on autoland the test order might change and have an affect I triggered new backfills now:
Comment 8•2 years ago
|
||
No new conclusion from running additional jobs. It's still the above changelog which isn't helpful. I doubt the change to devtools might have caused that.
Whenever checking the full log there is a single failure present for:
[task 2022-11-15T00:12:54.928Z] 00:12:54 INFO - TEST-FAIL | testing/marionette/harness/marionette_harness/tests/unit/test_expectedfail.py TestFail.test_fails | took 135ms
But as you can see this one is expected. So not sure what's causing us to fail with an exit code of 10 at all.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 12•2 years ago
|
||
Maybe the following errors for Telemetry during shutdown might not be related to this particular error in Marionette but might warrant investigation and separate bugs to be filed.
https://treeherder.mozilla.org/logviewer?job_id=399246634&repo=mozilla-central&lineNumber=155239
Chris, could someone from the Telemetry team please have a look? Thanks.
Comment 13•2 years ago
|
||
I actually filed bug 1805153 and moved the needinfo over there.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Updated•2 years ago
|
Comment 22•2 years ago
|
||
Comment hidden (Intermittent Failures Robot) |
Comment 24•2 years ago
|
||
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
For more information, please visit auto_nag documentation.
Reporter | ||
Comment 25•2 years ago
|
||
treeherder |
New failure instance: https://treeherder.mozilla.org/logviewer?job_id=407365560&repo=autoland
Comment hidden (Intermittent Failures Robot) |
Updated•2 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 92•7 months ago
|
||
This seems to be a mozharness related issue given that it surfaces around log parsing for tinderbox. We have two lines of code that could cause the return code of 10
:
https://searchfox.org/mozilla-central/source/testing/mozharness/mozharness/base/script.py#1664 (run_command
)
https://searchfox.org/mozilla-central/source/testing/mozharness/mozharness/base/script.py#1870 (get_output_from_command
)
I assume that the failure might come from parsing the log files and checking for failures.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 102•5 months ago
|
||
This actually happens when the Marionette harness sends a TEST-START
but misses to send a TEST-PASS
or TEST-FAIL
as it can be seen here for the test layout/base/tests/marionette/test_accessiblecaret_selection_mode.py AccessibleCaretSelectionModeTestCase.test_handle_tilt_when_carets_overlap_each_other_textarea_rtl
:
https://treeherder.mozilla.org/logviewer?job_id=471272624&repo=autoland&lineNumber=132339
[task 2024-08-22T00:05:46.635Z] 00:05:46 INFO - TEST-START | layout/base/tests/marionette/test_accessiblecaret_selection_mode.py AccessibleCaretSelectionModeTestCase.test_handle_tilt_when_carets_overlap_each_other_textarea_rtl
[..]
[task 2024-08-22T00:05:48.292Z] 00:05:48 INFO - 1724285148291 Marionette TRACE [20] MarionetteCommands actor destroyed for window id 12884901939
[task 2024-08-22T00:05:48.295Z] 00:05:48 INFO - 1724285148295 Marionette DEBUG 53 <- [1,97,null,{"value":null}]
[task 2024-08-22T00:05:48.296Z] 00:05:48 INFO - TEST-START | layout/base/tests/marionette/test_accessiblecaret_selection_mode.py AccessibleCaretSelectionModeTestCase.test_long_press_changes_focus_from_content_to_contenteditable
When this job fails it's actually always about this test. When I check the test file this specific test is generated via parameterization. As it looks like the test works fine and should have passed but whether we fail a check or miss to actually print the status.
Updated•5 months ago
|
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Updated•5 months ago
|
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 105•5 months ago
|
||
Interestingly, when forcing a random error in the test, the failure is correctly picked up by the harness.
However when forcing an error with self.assertEqual(target_content, ...)
, then we have the same issue: the job fails but the error is not reported. The error message that should be printed in this case is:
TEST_END: FAIL, expected PASS - AssertionError: ' فيرفكس موزيلا فيرفكس' != 'AAA'
- فيرفكس موزيلا فيرفكس
+ AAA
(AAA is just a random string I am expecting here to make the test fail). I wonder if on windows the test harness fails to process the arabic characters in the error message.
Assignee | ||
Comment 106•5 months ago
|
||
Note that other failures for this test are tracked under Bug 1855083
Assignee | ||
Comment 107•5 months ago
|
||
Just to confirm: this is a windows only issue triggered by the fact that we are unable to print the failure message containing arabic characters. I am trying at the moment to force PYTHONIOENCODING=utf-8 in the environment to see if it helps. https://treeherder.mozilla.org/jobs?repo=try&revision=7331676ff70d01f5a3c2d27cc7c4801d04c0612b
Assignee | ||
Comment 108•5 months ago
|
||
Hi Alex,
Henrik suggested reaching out to you about such issues.
Here we have an issue where Python on windows CI jobs is not able to print a failure message containing unicode characters, for instance
TEST_END: FAIL, expected PASS - AssertionError: ' فيرفكس موزيلا فيرفكس' != 'AAA'
With this log missing, the log parser is unable to detect the proper failure.
What I tried above was to force env["PYTHONIOENCODING"] = "utf-8"
in testing/mozharness/scripts/marionette.py (changeset) and this seems to fix the issue, although the printed error message is a bit harder to read:
AssertionError: ' \u0641\u064a\u0631\u0641\u0643\u0633 \u0645\u0648\u0632\u064a\u0644\u0627 \u0641\u064a\u0631\u0641\u0643\u0633' != 'AAA'
(which reminds me of https://bugzilla.mozilla.org/show_bug.cgi?id=1902981 which I filed a few months ago)
So it seems that there is a problem when trying to print unicode characters on Windows CI. Do you have a suggestion on how to fix it properly (potentially different from what I did in the patch above?)
Talking with :jgraham, there is also a suggestion to use sys.stdout.reconfigure
to properly set the encoding
Comment 109•5 months ago
|
||
I think env["PYTHONIOENCODING"] = "utf-8"
seems like the best solution.
The sys.stdout.reconfigure(encoding='utf-8')
should also work, though.
There's also this you could play around with:
import locale
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
(But you might need set ar_SA.UTF-8
specifically for that test with that approach? Not sure).
Disclaimer: I've dealt with problems like this before, but I'm by no means an expert.
Comment 110•5 months ago
|
||
Hi Alex. I’m not an expert with UTF-8 either, but I’m curious about what the default encoding is on Windows. Do we have any custom settings for mozbuild? If not, could it be a solution to set the encoding via mozbuild, ensuring that it works not only on CI systems but also for users running tests locally on their machines?
Updated•5 months ago
|
Comment 111•5 months ago
•
|
||
On Windows I'd expect the default encoding to either be cp-1252
or utf-8
(depending on the OS version and Python version, but maybe also locale?).
Setting it to utf-8
in the MozillaBuild profile would have side effects on tools that do use the system encoding (I assumed this, but checked with :glandium to confirm).
I don't think there's a better way than explicitly setting the encoding to handle a particular case.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 113•4 months ago
|
||
Thanks for all the input, I think for now we should go with the initial approach at least to get the failures correctly classified from now on.
Assignee | ||
Comment 114•4 months ago
|
||
Error messages with unicode characters failed to be correctly printed on windows.
Forcing PYTHONIOENCODING to utf-8 avoids the issue.
Updated•4 months ago
|
Assignee | ||
Comment 115•4 months ago
|
||
Aryx: quick question related to this bug. Here the reason the issue was hard to investigate was that we had a TEST-START without a corresponding TEST-PASS
- or any other line indicating the final result for the test.
In general do you think this is something we should try to detect in logs, and trigger failures in those cases? Or are there tests for which it is expected to have TEST-START but not necessarily any result?
Comment 116•4 months ago
|
||
If there is a TEST-START
marker, there should also be a marker to identify the end of the logging block for the test if the test suite continues to be executed - if Firefox gets terminated/crashed and testing does not resume, there should be no such marker. End markers are e.g. TEST-OK
, TEST-SKIP
, TEST-UNEXPECTED-FAIL
, TEST-FAIL
.
In such a case of a crash of the application, a fatal error is supposed to get logged by the test suite and provide information.
In case of Marionette, it's a test step (TEST-PASS
) which does not get logged but the same concept applies: if test execution continues after failure, an end marker for the failed test should be logged.
Comment 117•4 months ago
|
||
Comment 118•4 months ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Updated•4 months ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 123•4 months ago
|
||
All the recent failure classifications were incorrect and got re-classified. There is only one issue on esr115 which we are going to ignore.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Description
•