Some gecko-t-win64-aarch64-laptop don't have a complete python3 install
Categories
(Infrastructure & Operations :: RelOps: OpenCloudConfig, defect)
Tracking
(Not tracked)
People
(Reporter: glandium, Assigned: grenade)
Attachments
(1 file, 1 obsolete file)
+++ This bug was initially created as a clone of Bug #1545339 +++
When bug 1525373 landed, it caused problems similar to those that made me file bug 1545339... but on the gecko-t-win64-aarch64-laptop workers.
Assignee • Comment 1 • 5 years ago
this patch updates arm64 laptops to use the python 3 msi installers in the same way as the windows 7 32 bit workers do (windows aarch64 systems are compatible with x86 (32 bit) software binaries).
a slight complication is that we need to add a few arguments to the msi install command:
InstallAllUsers=1
TargetDir=C:\mozilla-build\python3
windows 10 hardware workers (both x86_64 and aarch64) bypass dsc for their software installations and use a custom install mechanism from the occ powershell module which is a few commits ahead of occ master, in the occ gamma branch.
i have already updated the module to support adding the additional msi install arguments.
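A minimal sketch of how the two extra arguments could be appended to an msi install command line. This is not the occ module's actual code: the helper name `build_msi_install_command` and the installer file name are hypothetical, and only the `InstallAllUsers` and `TargetDir` values come from the comment above. `/i` and `/qn` are standard msiexec switches for install and no UI.

```python
from typing import Dict, List


def build_msi_install_command(msi_path: str, properties: Dict[str, str]) -> List[str]:
    """Return an msiexec argument list that installs msi_path without a UI."""
    # /i installs the package, /qn suppresses the installer UI.
    command = ["msiexec", "/i", msi_path, "/qn"]
    # Extra arguments are appended as PROPERTY=value pairs, in the order given.
    command.extend(f"{name}={value}" for name, value in properties.items())
    return command


# The installer file name is hypothetical; the two properties are the ones listed above.
command = build_msi_install_command(
    r"C:\Windows\Temp\python3-x86.msi",
    {"InstallAllUsers": "1", "TargetDir": r"C:\mozilla-build\python3"},
)
print(" ".join(command))
```

Building the command as a list rather than a single string avoids quoting problems if a path ever contains spaces.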
Assignee • Comment 3 • 5 years ago
apologies for the slow response here. the problem is due to the fact that the aarch64 systems are not running occ at boot and as such don't pick up the new configuration which would use the python 3 installer scripts.
someone at bitbar will have to manually run occ on these systems or we wait for the migration to ronin (bug 1530414) on these systems.
Reporter • Comment 4 • 5 years ago
Assignee • Comment 6 • 5 years ago
(In reply to Mike Hommey [:glandium] from comment #5)
> Any update here?
no.
the aarch64 occ implementation is broken. we don't have a way to update them without someone at bitbar manually running commands on each instance.
this is compounded by the fact that we don't intend to fix occ on these and are migrating these workers to ronin puppet.
we don't yet have puppet manifests for aarch64: this platform uses x86 rather than x86_64 installers, and we still need to write manifests for those.
so there are still a few yaks to shave before this will be fixed.
Reporter • Comment 7 • 5 years ago
Apparently, we have been able to fix the x86-64 hardware at bitbar in bug 1569091. Can we do the same somehow here?
Assignee • Comment 9 • 5 years ago
i have been patching yoga systems today to make them run occ between tasks (which also causes them to pick up the python 3 install).
the following aarch64 systems have picked up the patch:
- yoga-001
- yoga-002
- yoga-003
- yoga-004
- yoga-005
- yoga-008
- yoga-009
- yoga-010
- yoga-011
- yoga-012
- yoga-014
- yoga-015
- yoga-016
- yoga-017
- yoga-018
- yoga-019
- yoga-020
- yoga-021
- yoga-022
- yoga-023
- yoga-027
this represents 21 out of 35 systems. i will check back in the morning to see if all systems have picked up the patch. if not we might need manual intervention at bitbar on the remaining systems.
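The count above can be checked with a short sketch. It assumes, purely for illustration, that the pool is numbered yoga-001 through yoga-035 with no gaps; the thread gives only the total of 35, not the full host list.

```python
# The 21 hosts reported as patched in the list above.
patched_numbers = [1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 14,
                   15, 16, 17, 18, 19, 20, 21, 22, 23, 27]
patched = {f"yoga-{n:03d}" for n in patched_numbers}

# ASSUMPTION: hosts are numbered yoga-001..yoga-035 contiguously.
pool = {f"yoga-{n:03d}" for n in range(1, 36)}

# Hosts that still need the patch, in name order.
remaining = sorted(pool - patched)
print(len(patched), "patched,", len(remaining), "remaining")
for host in remaining:
    print(host)
```

Under that numbering assumption the set difference leaves 14 hosts, which matches 35 minus 21.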
Assignee • Comment 10 • 5 years ago
none of the remaining 14 systems picked up the patch overnight so i am going to attempt to force them to do so. if that fails, i will ask bitbar to intervene when the sun comes up over santa monica boulevard...
Assignee • Comment 11 • 5 years ago
the remaining working systems have been patched.
Comment 12 • 5 years ago
Comment on attachment 9080162 [details]
Bug 1557614 - Enable run-task on aarch64 laptop workers.
Revision D39100 was moved to bug 1578963. Setting attachment 9080162 [details] to obsolete.
Reporter • Comment 13 • 5 years ago
The gecko-t-win64-aarch64-laptop workers still have a busted python 3. Example from today: https://tools.taskcluster.net/groups/ROHjH_hWT8a2ihaxnrsPUw/tasks/SDImf8bNTxGKjXojr1Y3PQ/runs/0/logs/public%2Flogs%2Flive.log
Assignee • Comment 15 • 5 years ago
i believe i have found the issues.
- on at least 1 system (yoga-026), occ has not run and has not installed python3. generic worker is continuously rebooting the system before it has had a chance to run occ. i believe this is due to an outdated gw-wrapper script which i will attempt to patch using the elevated task mechanism.
- on many of the rest of these systems, python 3.7 was installed by occ, however it was installed over the top of a previous python 3.6 installation. the 3.6 install includes C:\mozilla-build\python3\python3.exe. the 3.7 install does not; it includes only C:\mozilla-build\python3\python.exe. this explains the broken python behaviours, since the task linked in comment 13 above unwittingly calls the broken python 3.6 install. i have patched this by using elevated tasks to first remove C:\mozilla-build\python3\python3.exe and then replace it with a symlink to C:\mozilla-build\python3\python.exe.
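The remove-then-symlink repair described in the second item can be sketched as follows. This is an illustration, not the elevated task that was actually run: the function name `replace_python3_with_symlink` is hypothetical, and only the file names `python.exe` and `python3.exe` come from the comment.

```python
from pathlib import Path


def replace_python3_with_symlink(install_dir: Path) -> Path:
    """Replace a stale python3.exe with a symlink to python.exe in install_dir."""
    target = install_dir / "python.exe"
    link = install_dir / "python3.exe"
    if not target.exists():
        raise FileNotFoundError(f"expected interpreter not found: {target}")
    # Remove the leftover binary (or an earlier link) before creating the new link.
    if link.exists() or link.is_symlink():
        link.unlink()
    link.symlink_to(target)
    return link
```

On a worker this would be called with `Path(r"C:\mozilla-build\python3")`. Creating symlinks on Windows generally requires elevated privileges, which fits the use of elevated tasks here.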
on a final note, i see that the task in comment 13 appears to invoke python 2.7. i don't know if this is intentional or not but the command reads:
C:/mozilla-build/python3/python3.exe run-task -- c:\mozilla-build\python\python.exe -u mozharness\scripts\desktop_unittest.py ...
which to my reasoning looks like a python 3 call wrapping a python 2 call. this might be what is intended or it might be something else.
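To make the nesting explicit, here is a small sketch that splits the command at `--`. It assumes `--` separates the wrapper's own arguments from the command it launches; the helper name `split_wrapped_command` is hypothetical, and the arguments elided with `...` in the original command are left out rather than guessed.

```python
from typing import List, Tuple


def split_wrapped_command(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split argv at the first "--" into (wrapper, wrapped) argument lists."""
    index = argv.index("--")  # raises ValueError if there is no separator
    return argv[:index], argv[index + 1:]


# Arguments copied from the command above; the elided tail is omitted.
argv = [
    "C:/mozilla-build/python3/python3.exe", "run-task", "--",
    r"c:\mozilla-build\python\python.exe", "-u",
    r"mozharness\scripts\desktop_unittest.py",
]
wrapper, wrapped = split_wrapped_command(argv)
print("outer interpreter:", wrapper[0])
print("inner interpreter:", wrapped[0])
```

Read this way, the outer interpreter lives under the python3 directory and the inner one under the python directory, which is the python 3 wrapping python 2 shape described above.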
Assignee • Comment 16 • 5 years ago
the following yoga systems are awol from the worker explorer and not taking tasks:
- yoga-002
- yoga-031
- yoga-032
if someone at bitbar has a moment, please run the following command from an elevated powershell prompt on the above machines (they should reboot themselves when it completes):
iex (New-Object Net.WebClient).DownloadString('https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/master/userdata/rundsc.ps1')
Comment 17 • 2 years ago
backlog cleanup