Perma toolchain-macosx-custom-car - ModuleNotFoundError: No module named 'encodings' | Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Categories
(Firefox Build System :: Toolchains, defect, P5)
Tracking
(firefox124 fixed)
People
(Reporter: intermittent-bug-filer, Unassigned)
Details
(Keywords: intermittent-failure)
Filed by: abutkovits [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=443158742&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/OOiqPoPGTCyEd7uUOU4Ssw/runs/0/artifacts/public/logs/live_backing.log
[task 2024-01-12T20:36:12.841Z] is in build tree = 0
[task 2024-01-12T20:36:12.841Z] stdlib dir = '/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out/lib/python3.11'
[task 2024-01-12T20:36:12.841Z] sys._base_executable = '/Users/task_169748250232954/Library/Caches/.vpython-root/store/python_venv-p8vln962kkg6gu1ia462t1rgkk/contents/bin/python3'
[task 2024-01-12T20:36:12.841Z] sys.base_prefix = '/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out'
[task 2024-01-12T20:36:12.841Z] sys.base_exec_prefix = '/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out'
[task 2024-01-12T20:36:12.841Z] sys.platlibdir = 'lib'
[task 2024-01-12T20:36:12.841Z] sys.executable = '/usr/local/bin/python3'
[task 2024-01-12T20:36:12.841Z] sys.prefix = '/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out'
[task 2024-01-12T20:36:12.841Z] sys.exec_prefix = '/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out'
[task 2024-01-12T20:36:12.841Z] sys.path = [
[task 2024-01-12T20:36:12.841Z] '/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out/lib/python311.zip',
[task 2024-01-12T20:36:12.841Z] '/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out/lib/python3.11',
[task 2024-01-12T20:36:12.841Z] '/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out/lib/python3.11/lib-dynload',
[task 2024-01-12T20:36:12.841Z] ]
[task 2024-01-12T20:36:12.841Z] Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
[task 2024-01-12T20:36:12.841Z] Python runtime state: core initialized
[task 2024-01-12T20:36:12.841Z] ModuleNotFoundError: No module named 'encodings'
[task 2024-01-12T20:36:12.841Z]
[task 2024-01-12T20:36:12.841Z] Current thread 0x00000001163c6dc0 (most recent call first):
[task 2024-01-12T20:36:12.841Z] <no Python frame>
[taskcluster 2024-01-12T20:36:12.851Z] Exit Code: 1
[taskcluster 2024-01-12T20:36:12.851Z] User Time: 2m18.637991s
[taskcluster 2024-01-12T20:36:12.851Z] Kernel Time: 2m13.17151s
[taskcluster 2024-01-12T20:36:12.851Z] Wall Time: 1m8.086922s
[taskcluster 2024-01-12T20:36:12.851Z] Result: FAILED
[taskcluster 2024-01-12T20:36:12.851Z] === Task Finished ===
[taskcluster 2024-01-12T20:36:12.851Z] Task Duration: 1m8.091122s
[taskcluster:error] Uploading error artifact public/build from file public/build with message "Could not read directory '/opt/worker/tasks/task_169748250232954/public/build'", reason "file-missing-on-worker" and expiry 2024-04-11T20:33:54.675Z
[taskcluster:error] TASK FAILURE during artifact upload: file-missing-on-worker: Could not read directory '/opt/worker/tasks/task_169748250232954/public/build'
[taskcluster 2024-01-12T20:36:12.989Z] Uploading artifact public/logs/certified.log from file /opt/worker/tasks/task_169748250232954/generic-worker/certified.log with content encoding "gzip", mime type "text/plain; charset=utf-8" and expiry 2024-04-11T20:33:54.675Z
[taskcluster 2024-01-12T20:36:13.324Z] Uploading artifact public/chain-of-trust.json from file /opt/worker/tasks/task_169748250232954/generic-worker/chain-of-trust.json with content encoding "gzip", mime type "text/plain; charset=utf-8" and expiry 2024-04-11T20:33:54.675Z
[taskcluster 2024-01-12T20:36:13.573Z] Uploading artifact public/chain-of-trust.json.sig from file /opt/worker/tasks/task_169748250232954/generic-worker/chain-of-trust.json.sig with content encoding "gzip", mime type "application/octet-stream" and expiry 2024-04-11T20:33:54.675Z
[taskcluster 2024-01-12T20:36:13.792Z] [mounts] Preserving cache: Moving "/opt/worker/tasks/task_169748250232954/checkouts" to "/opt/worker/cache/MoCaUrYPRpCz8IJ3NQZA3w"
[taskcluster 2024-01-12T20:36:13.793Z] [mounts] Denying task_169748250232954 access to '/opt/worker/cache/MoCaUrYPRpCz8IJ3NQZA3w'
[taskcluster 2024-01-12T20:36:13.865Z] Uploading link artifact public/logs/live.log to artifact public/logs/live_backing.log with expiry 2024-04-11T20:33:54.675Z
[taskcluster:error] exit status 1
Comment 1•1 year ago
I see this happened on Jan 4 as well
https://treeherder.mozilla.org/jobs?repo=mozilla-central&selectedTaskRun=UQ7LC1-XRTmFdHJ9ptFuzQ.1&tier=1%2C2%2C3&searchStr=m-car%2Cmac%2Ctool&revision=78e69b0bee0713262f9b80b8cb36b8e74539d97e
but it resolved itself shortly after?
That time it was triaged under Bug 1825788.
Comment 2•1 year ago
(In reply to Kash Shampur [:kshampur] ⌚EST from comment #1)
> I see this happened on Jan 4 as well
> https://treeherder.mozilla.org/jobs?repo=mozilla-central&selectedTaskRun=UQ7LC1-XRTmFdHJ9ptFuzQ.1&tier=1%2C2%2C3&searchStr=m-car%2Cmac%2Ctool&revision=78e69b0bee0713262f9b80b8cb36b8e74539d97e
> but resolved itself shortly after?
> That time it was triaged under Bug 1825788
Okay, so this is probably the Python 3.11 change upstream.
On Jan 4 they landed the 3.11 upgrade, then reverted it shortly after, which is why it was passing between Jan 5 and Jan 11.
Today they landed a 3.11 upgrade again.
Comment 4•1 year ago
This is not a Python bug, this is a symptom of setting PYTHONHOME and/or PYTHONPATH when they’re not needed.
Are we setting those, or are upstream scripts doing it?
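One way to answer that question from inside the task is to dump the two variables before the failing interpreter is started. The following is only a minimal sketch: the helper name `set_python_overrides` is made up for illustration and is not part of any in-tree or upstream script.

```python
import os

# The two variables named in the comment above; CPython reads both at startup.
SUSPECT_VARS = ("PYTHONHOME", "PYTHONPATH")


def set_python_overrides(env=None):
    """Return {name: value} for each suspect variable present in env.

    Hypothetical helper for illustration; env defaults to os.environ.
    """
    env = os.environ if env is None else env
    return {name: env[name] for name in SUSPECT_VARS if name in env}


if __name__ == "__main__":
    found = set_python_overrides()
    for name in SUSPECT_VARS:
        # Mirrors the "(not set)" wording CPython uses in its startup dump.
        print(f"{name} = {found.get(name, '(not set)')}")
```

Running this with a known-good interpreter in the same environment as the failing step would show whether either variable is inherited from the worker.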
Comment 5•1 year ago
We are not setting it as far as I can tell. As for upstream, this is the only recent (~this month) change I found related to PYTHONPATH:
Also, looking at the full log:
[task 2024-01-12T20:36:12.841Z] PYTHONHOME = (not set)
[task 2024-01-12T20:36:12.841Z] PYTHONPATH = (not set)
Neither is set there either.
Leaving ni? as I keep looking into this.
Comment 6•1 year ago
I wonder if this is an issue as well:
[task 2024-01-12T20:36:12.841Z] sys.executable = '/usr/local/bin/python3'
It is detecting the system Python; I don't think that should be the case.
I thought this change in the past should circumvent that
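The mismatch can be made explicit by comparing the two paths from the log. This is a rough sketch, not a general rule: the helper name `executable_under_prefix` is invented here, and the check is only a heuristic for the situation in this log, where `sys.executable` and `sys.prefix` point at unrelated trees.

```python
from pathlib import PurePosixPath


def executable_under_prefix(executable, prefix):
    """True if the interpreter path is located somewhere inside prefix."""
    return PurePosixPath(prefix) in PurePosixPath(executable).parents


# Values copied from the failing log above.
LOG_EXECUTABLE = "/usr/local/bin/python3"
LOG_PREFIX = "/opt/s/w/ir/x/w/3pp/wd/tools/cpython3/mac-amd64/3.11.5/out"

if __name__ == "__main__":
    # Prints False: the reported executable is not part of the 3.11.5 tree.
    print(executable_under_prefix(LOG_EXECUTABLE, LOG_PREFIX))
```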
Comment 7•1 year ago
This one is weird. As far as I can tell, PYTHONHOME/PYTHONPATH are untouched.
I have a suspicion that the system Python (which is old, at 3.7.x) is interfering somehow, but I am not 100% sure how or why, or whether it really is the culprit.
Using the fetched Python toolchain or the depot_tools bundled Python in the PATH doesn't seem to help either.
:glandium would you know if it is possible to upgrade the system Python on the b-osx worker, and how much effort or disruption that would be? (Or is this more of a releng/ops question?)
Comment 8•1 year ago
Minor update: by setting PYTHONEXECUTABLE and using the 3.11 toolchain, I had a try run that made it past the failure point from this bug. (It failed on something else that I am looking into.)
Comment 9•1 year ago
Unsure how to make sense of this:
In this Try, I installed httplib2 (which, by the way, was probably a bad idea, sorry. I thought the machines would be "clean" between runs, but that doesn't seem to be the case?)
The first run there made it up until the SSL issue (for which I'd have to install certifi, but is it safe to do so with pip install? These machines don't seem to be clean between runs).
The second retrigger (same machine, 197) hits the httplib2 module not being there (which doesn't make sense, since it is installed?). A quick Google search suggests this is a possible issue when there are many Python versions on a system. That is probably the case here: the 3.11 toolchain version is present in the PATH as well as the 3.7 system Python, and previous runs would have used the 3.8 toolchain, so I am sure there is stuff from that lingering somewhere.
The third retrigger (on machine 196 now) hits a missing pyparsing module (so, similar to certifi, I'd have to install this too?).
Thoughts/questions:
- each retrigger having a different error makes this a bit tough
- what is the best way to correctly install the missing modules?
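Since each retrigger may be running a different interpreter, it can help to log which interpreter is active and which of the modules above it can actually see. A minimal sketch, assuming the script is run by the same `python3` the failing step uses; `missing_modules` is a hypothetical helper, not something from gclient or depot_tools.

```python
import importlib.util
import sys

# Modules mentioned in the comment above.
REQUIRED = ("httplib2", "certifi", "pyparsing")


def missing_modules(names):
    """Return the top-level module names this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # A module installed for one Python version is invisible to another,
    # so report the interpreter alongside the result.
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    print("missing:", missing_modules(REQUIRED))
```

Comparing this output across retriggers would show whether the missing module errors line up with a change of interpreter.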
Comment 10•1 year ago
Re: the SSL errors above, could we upgrade the OpenSSL version on the machines? I'm not entirely sure that will help solve it, but is there any harm in doing the upgrade?
Comment 12•1 year ago
Quick update/summary:
1- One scenario where I use the in-tree Python 3.11.7. I have this error, which in the past was usually solved by installing certifi, but that did not help this time. (It is possible that the gclient script is just ignoring the environment I set, but I am not 100% sure yet.)
2- Another scenario where I instead use a bundled Python version bootstrapped from Chromium's depot_tools. In this error here, the Python configuration is a mix of the bootstrapped Python (3.11.6) and 3.8.10. Why is 3.8.10 being picked up? I've tried setting environment variables (PATH, PYTHONPATH, PYTHONHOME, PYTHONEXECUTABLE) to attempt to override that, but it does not seem to work.
That is not to say that in (2) there is no SSL error like in (1); it might still be there (but just hidden behind the configuration issue).
:sergesanspaille I recall you've dealt with some Python/OSX machine issues in the past. This is a long shot, but I'm wondering if you might have insight into either of these scenarios with the Python environment on these machines.
Comment 13•1 year ago
One thing that looks suspicious is the fact that sys.executable still points at /usr/local/bin/python3.
I've met that situation before and it was due to __PYVENV_LAUNCHER__ being set (that comes from brew). Could you give it a try with __PYVENV_LAUNCHER__ unset?
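The suggestion above can be tried without touching the worker configuration by launching the interpreter with a copy of the environment that lacks the variable. This is only a sketch under that assumption; `env_without` and `reported_executable` are illustrative names, and the actual fix in the tree may look different.

```python
import os
import subprocess
import sys

LAUNCHER_VAR = "__PYVENV_LAUNCHER__"


def env_without(names, env=None):
    """Return a copy of env (default: os.environ) with the given names dropped."""
    env = os.environ if env is None else env
    return {key: value for key, value in env.items() if key not in names}


def reported_executable(python, env):
    """Ask the given interpreter what it thinks sys.executable is."""
    result = subprocess.run(
        [python, "-c", "import sys; print(sys.executable)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    clean = env_without({LAUNCHER_VAR})
    print(reported_executable(sys.executable, clean))
```

If the value printed with the cleaned environment differs from the one in the failing log, that points at the launcher variable rather than PATH.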
Comment 14•1 year ago
As of writing this comment, unsetting __PYVENV_LAUNCHER__ seems to have worked (try in progress); at the very least it reached the build stage of the script.
It is a little funny that it worked with the current in-tree setup (which uses the Python 3.8 toolchain) but not with the 3.11.7 toolchain or the depot_tools bundled Python (3.11.6).
Thanks :sergesanspaille! Hopefully the build passes.
Edit: my mistake, this does actually work with the 3.11.7 toolchain, but not with the 3.11.6 bundled depot_tools Python (the latter maybe just needs some tweaks).
But for now I will keep changes minimal in Bug 1876072 and keep using the 3.8 toolchain.
Comment 15•1 year ago
\o/ I'll propose a patch so that this gets more robust in the future and we don't trip on that randomly for OSX builders.
Comment 16•1 year ago
Green after Bug 1876072 reached central.
Let's deem this as fixed too.