Some gecko-t-win7-32 workers don't have a complete python3 install
Categories
(Infrastructure & Operations :: RelOps: OpenCloudConfig, defect)
Tracking
(Not tracked)
People
(Reporter: glandium, Assigned: grenade)
Attachments
(1 file)
While working on bug 1525373 I got weird failures that I went on to debug, and while doing so, I got even weirder behavior.
After further investigation, it turns out C:\mozilla-build\python3, on some workers, isn't complete.
On one worker, where I reproduced the problem with a small script printing out some information, the contents of that directory were:
['DLLs', 'Lib', 'libs', 'python3.exe', 'Scripts', 'tcl']
while on a run where everything went fine, the contents were:
['DLLs', 'Doc', 'include', 'Lib', 'libs', 'LICENSE.txt', 'NEWS.txt', 'python3.dll', 'python3.exe', 'python36.dll', 'pythonw.exe', 'Scripts', 'tcl', 'Tools', 'vcruntime140.dll']
Running python3.exe in the former just outputs "Exit Code: -1073741515"
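For reference, that exit code is easier to interpret as an unsigned 32-bit NTSTATUS value. A minimal sketch of the conversion (the helper name `to_ntstatus` is illustrative only):

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit process exit code as an unsigned hex value."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus(-1073741515))  # 0xC0000135
```

0xC0000135 is STATUS_DLL_NOT_FOUND, which would be consistent with python36.dll and vcruntime140.dll being absent from the incomplete listing.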
Comment 1 (Reporter) • 6 years ago
FWIW, the worker I got with the incomplete python3 was i-08c0fce6e303ca682.
Comment 2 (Reporter) • 6 years ago
Another one: i-0b4d5d0ab0112d3da
Comment 3 (Reporter) • 6 years ago
There's another kind of broken worker that has a different set of files and fails differently:
['DLLs', 'Lib', 'LICENSE.txt', 'NEWS.txt', 'python3.dll', 'python3.exe', 'python36.dll', 'pythonw.exe', 'Scripts', 'vcruntime140.dll']
Those fail with:
Fatal Python error: Py_Initialize: unable to load the file system codec
ModuleNotFoundError: No module named 'encodings'
Example of a worker I got in that situation: i-0c3fc6af7e360d426
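To make the differences between the three listings explicit, here is a small sketch that diffs each broken listing against the complete one. The directory listings are copied from the comments above; the names `GOOD`, `BROKEN_A`, `BROKEN_B` and `missing` are illustrative.

```python
# Listing from a run where everything went fine.
GOOD = ['DLLs', 'Doc', 'include', 'Lib', 'libs', 'LICENSE.txt', 'NEWS.txt',
        'python3.dll', 'python3.exe', 'python36.dll', 'pythonw.exe',
        'Scripts', 'tcl', 'Tools', 'vcruntime140.dll']
# First kind of broken worker (python3.exe exits with -1073741515).
BROKEN_A = ['DLLs', 'Lib', 'libs', 'python3.exe', 'Scripts', 'tcl']
# Second kind of broken worker (fails to import 'encodings').
BROKEN_B = ['DLLs', 'Lib', 'LICENSE.txt', 'NEWS.txt', 'python3.dll',
            'python3.exe', 'python36.dll', 'pythonw.exe', 'Scripts',
            'vcruntime140.dll']


def missing(expected, actual):
    """Return the entries of expected that are absent from actual, in order."""
    present = set(actual)
    return [name for name in expected if name not in present]


print(missing(GOOD, BROKEN_A))
print(missing(GOOD, BROKEN_B))
```

The first kind lacks the runtime DLLs among other things, while the second kind has them but lacks Doc, include, libs, tcl and Tools. Since Lib is present in the second kind yet 'encodings' cannot be imported, the Lib directory itself is presumably only partially populated there.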
Comment 4 (Assignee) • 6 years ago
debugging today, i can see that the python install is very frequently failing on windows 7 with log messages like this:
May 07 14:33:04 i-0fbe0390616d5fac9.gecko-t-win7-32.euc1.mozilla.com occ-dsc: Invoke-LoggedCommandRun (Python3) :: command (C:\windows\Temp\0aecc2a136909051f4099015b9cc0ac52155160203e1cea2c82f397178818388c43d152b8678b98cd8a8d871626204909b21ba91b0d7aae566642f2f81570ebd.exe /quiet InstallAllUsers=1 TargetDir=C:\mozilla-build\python3) exited with code: 1603 after a processing time of: 00:00:00.0312002
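a rough sketch for pulling the exit code and processing time out of log lines like the one above. the regular expression is written against this single sample, so the exact message format is an assumption, and `parse_exit` is an illustrative name:

```python
import re
from datetime import timedelta

# Tail of the sample log line quoted above.
LOG_TAIL = "exited with code: 1603 after a processing time of: 00:00:00.0312002"

# Assumed format, inferred from the single sample line.
PATTERN = re.compile(
    r"exited with code: (?P<code>-?\d+) after a processing time of: "
    r"(?P<h>\d+):(?P<m>\d+):(?P<s>\d+(?:\.\d+)?)"
)


def parse_exit(line):
    """Return (exit_code, elapsed) for a matching log line, else None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    elapsed = timedelta(
        hours=int(match.group("h")),
        minutes=int(match.group("m")),
        seconds=float(match.group("s")),
    )
    return int(match.group("code")), elapsed


print(parse_exit(LOG_TAIL))
```

for the sample line this gives exit code 1603 and an elapsed time of roughly 31 ms, so the command returned almost immediately.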
Comment 5 (Reporter) • 6 years ago
1603 is a generic "Fatal Error During Installation", which is not helpful as to what the hell is going on :(
MS site has a few suggestions of things to try:
https://support.microsoft.com/en-us/help/834484/you-receive-an-error-1603-a-fatal-error-occurred-during-installation
Comment 6 (Assignee) • 6 years ago
this is a fairly ugly rabbit hole.
- on 64 bit windows, we install a newer version of mozilla-build which contains python 3
- on 32 bit windows, we install mozilla-build 2.2 which was the last version to support 32 bit systems and does not contain python 3
- we install python 3 using the exe installer which creates (among other things) c:\mozilla-build\python3\python.exe
- since we want the c:\mozilla-build\python3 directory in the system path (see bug 1505057) but we also want python.exe calls to default to python 2, we rename c:\mozilla-build\python3\python.exe to c:\mozilla-build\python3\python3.exe so that calls to python are deterministic about which python version gets called
- the python 3 installer in use (https://www.python.org/ftp/python/3.6.3/python-3.6.3.exe) has an interesting quirk in that it does not wait for the install to complete before returning the session. this means that our bootstrap process thinks the python 3 install has completed, when it is in fact still in progress.
- because of this, our current incomplete installation errors are caused by the rename of python3\python.exe to python3\python3.exe while the python 3 installer is still running. the installer actually calls and runs python3\python.exe as part of the install process in order to build components of the install. that process is interrupted by the rename that is triggered before the python 3 install has completed.
i am testing several mechanisms to prevent the rename from occurring before the install process has completed, the most promising of which is also the ugliest. eg:
python-3.6.3.exe /quiet /repair InstallAllUsers=1 TargetDir=C:\mozilla-build\python3 && sleep 120
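as an illustration of one alternative to a fixed sleep, here is a minimal sketch that polls the target directory until a set of expected entries exists, then gives up after a timeout. this is not what occ does; `wait_for_entries` and its parameters are hypothetical, and the presence of files alone does not prove that the installer's background processes have finished writing them.

```python
import os
import time


def wait_for_entries(directory, expected, timeout=120.0, interval=1.0):
    """Poll until every name in expected exists in directory.

    Returns True once all entries are present, False if the timeout elapses.
    """
    wanted = set(expected)
    deadline = time.monotonic() + timeout
    while True:
        try:
            present = set(os.listdir(directory))
        except FileNotFoundError:
            # The installer may not have created the directory yet.
            present = set()
        if wanted <= present:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

a caller would only rename python.exe to python3.exe after this returns True, e.g. `wait_for_entries(r"C:\mozilla-build\python3", ["python.exe", "python36.dll", "vcruntime140.dll"])`.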
Comment 7 (Assignee) • 6 years ago
i tried a number of approaches to making occ wait for the python exe install to complete and none were reliable. the problem is that the installer spawns a number of background processes to install the various python components and there isn't a good mechanism for tracking completion on all of them.
what does work is to use the individual msi component installers and install them each individually. no msi installer that includes all components is provided so, as far as i can see, this is the only reliable mechanism that is available to us.
i have tested this on the windows 7 beta workers and consistently arrived at complete installations of python 3. all other mechanisms i tried resulted in roughly a 50/50 chance of a complete installation existing at the time of first task execution.
i wanted the time invested in fixing the windows 7 python 3 install for our infra to last a little while, so i opted to also update python 3 to python 3.7.3, which is the current recommended stable version (we were using 3.6.3), although the same mechanism would work for 3.6.3 if we want to do this all over again soon.
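a rough sketch of the per-component approach, installing each msi in sequence and waiting for each msiexec process to exit before starting the next. the component file names in the usage note and the `TARGETDIR` property are assumptions for illustration, not the actual occ manifest; `msiexec_command` and `install_components` are hypothetical helpers.

```python
import subprocess


def msiexec_command(msi_path, target_dir):
    """Build a quiet msiexec install command for one component msi.

    TARGETDIR is assumed to be the property the component msi honours.
    """
    return ["msiexec", "/i", msi_path, "/quiet", f"TARGETDIR={target_dir}"]


def install_components(msi_paths, target_dir, run=subprocess.run):
    """Install each msi in order, blocking until each one exits.

    run defaults to subprocess.run, which waits for the process to finish
    and, with check=True, raises CalledProcessError on a non-zero exit code.
    """
    for msi_path in msi_paths:
        run(msiexec_command(msi_path, target_dir), check=True)
```

usage would look something like `install_components(["core.msi", "lib.msi"], r"C:\mozilla-build\python3")`, where the msi names are placeholders for whichever component installers are actually needed. because each call blocks, the rename of python.exe can safely follow the last one.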
Comment 9 (Assignee) • 6 years ago
deployment is complete