Intermittent Android 7.0 failures Unsuccessful task run with exit code: 1 - python version is older than required
Categories
(Testing :: General, defect, P2)
Tracking
(Not tracked)
People
(Reporter: nataliaCs, Unassigned)
Details
[task 2023-12-20T03:07:05.688Z] Running: python3 /builds/worker/checkouts/gecko/mach python /builds/worker/workspace/mozharness/scripts/web_platform_tests.py --config-file /builds/worker/workspace/mozharness/configs/android/android-x86_64.py --config-file /builds/worker/workspace/mozharness/configs/web_platform_tests/prod_config_android.py --test-type=testharness --skip-implementation-status=backlog --skip-implementation-status=not-implementing --skip-timeout --skip-crash --exclude-tag=webgpu --exclude-tag=canvas --disable-fission --setpref=media.peerconnection.mtransport_process=false --setpref=network.process.enabled=false --setpref=layers.d3d11.enable-blacklist=false --download-symbols=ondemand
[task 2023-12-20T03:07:05.715Z] Python 3.8+ is required to run mach.
[task 2023-12-20T03:07:05.715Z] You are running Mach with Python 3.7.5
[task 2023-12-20T03:07:05.715Z] See https://firefox-source-docs.mozilla.org/setup/linux_build.html#installingpython
[task 2023-12-20T03:07:05.715Z] for guidance on how to install Python on your system.
[task 2023-12-20T03:07:05.718Z] cleanup
[task 2023-12-20T03:07:05.718Z] + cleanup
[task 2023-12-20T03:07:05.718Z] + local rv=1
[task 2023-12-20T03:07:05.718Z] + [[ -s /builds/worker/.xsession-errors ]]
[task 2023-12-20T03:07:05.718Z] + cp /builds/worker/.xsession-errors /builds/worker/artifacts/public/xsession-errors.log
[task 2023-12-20T03:07:05.720Z] + '[' ']'
[task 2023-12-20T03:07:05.720Z] + true
[task 2023-12-20T03:07:05.720Z] + cleanup_xvfb
[task 2023-12-20T03:07:05.720Z] ++ pidof Xvfb
[task 2023-12-20T03:07:05.722Z] + local xvfb_pid=56
[task 2023-12-20T03:07:05.722Z] + local vnc=false
[task 2023-12-20T03:07:05.722Z] + local interactive=false
[task 2023-12-20T03:07:05.722Z] + '[' -n 56 ']'
[task 2023-12-20T03:07:05.722Z] + [[ false == false ]]
[task 2023-12-20T03:07:05.722Z] + [[ false == false ]]
[task 2023-12-20T03:07:05.722Z] + kill 56
[task 2023-12-20T03:07:05.722Z] + screen -XS xvfb quit
[task 2023-12-20T03:07:05.724Z] + exit 1
[taskcluster 2023-12-20 03:07:08.370Z] === Task Finished ===
[taskcluster 2023-12-20 03:07:09.022Z] Unsuccessful task run with exit code: 1 completed in 100.902 seconds
Comment 1•2 years ago
It looks like the test tasks use whatever checkout happens to be on the worker, which does not guarantee a tree that matches the current push. In this case, the worker probably had a checkout from central or beta, which requires Python 3.8, while the push was from mozilla-release. The docker images from mozilla-release don't have Python 3.8; the ones from beta/central do.
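To make the mismatch concrete, here is a minimal sketch of the kind of minimum-version gate that would produce the two messages in the log. The name `check_python` and its arguments are illustrative assumptions, not mach's actual code; only the 3.8 minimum and the 3.7.5 interpreter come from the log.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum reported in the log: "Python 3.8+ is required"


def check_python(version_info=None, minimum=MIN_PYTHON):
    """Return (ok, message) for an interpreter version tuple (hypothetical helper)."""
    if version_info is None:
        version_info = sys.version_info
    current = tuple(version_info[:3])
    # Compare only major.minor against the required minimum.
    if current[:2] >= minimum:
        return True, ""
    message = (
        "Python {}.{}+ is required to run mach.\n"
        "You are running Mach with Python {}.{}.{}"
    ).format(minimum[0], minimum[1], *current)
    return False, message
```

A tree from central or beta carries the 3.8 gate, so running it inside a mozilla-release image with 3.7.5 fails before any test starts.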
Comment 2•2 years ago
Bug 1843209 migrated to mozilla-beta on Dec 18th, which was merge day, and this is where Python 3.8 became required; previously mozilla-beta was running Python 3.7.5. When it merged, the same task on beta used the ubuntu1804-test image (built from autoland 8 days earlier: https://firefox-ci-tc.services.mozilla.com/tasks/X4ORcEmsSJaZx_7oyLhCIw ). Today the same docker image from autoland is still used on mozilla-beta for the wpt android lite tasks.
What is odd is that retriggers are green on mozilla-release, and so are other tasks. Both the failing and the passing retriggers use the same docker image, built from beta on December 7th (https://firefox-ci-tc.services.mozilla.com/tasks/cJCZU90jTsWHsJm1_2O-IA).
Why do we have different docker image bases? This seems like a task graph dependency issue?
Looking at the above try push, you can see:
- wpt1 failed - task dependencies:
  - P-IVao4hT6S_hV6ZFD5bgg
  - TXQtv_ViRueEF565Ql5lOw
  - a2FlXVSOQou0DeVAYyepPA
  - cJCZU90jTsWHsJm1_2O-IA
  - dcCAhtrQSNGfy68R7l0Dhw
  - f2fTnEdAQ8yJOMMYn6dnPg
  - fN6H5bi-QdWoyc8kIRrmwA
the task runs ./mach:
[task 2023-12-20T03:07:05.686Z] + /builds/worker/bin/run-mozharness
[task 2023-12-20T03:07:05.688Z] Running: python3 /builds/worker/checkouts/gecko/mach python /builds/worker/workspace/mozharness/scripts/web_platform_tests.py --config-file /builds/worker/workspace/mozharness/configs/android/android-x86_64.py --config-file /builds/worker/workspace/mozharness/configs/web_platform_tests/prod_config_android.py --test-type=testharness --skip-implementation-status=backlog --skip-implementation-status=not-implementing --skip-timeout --skip-crash --exclude-tag=webgpu --exclude-tag=canvas --disable-fission --setpref=media.peerconnection.mtransport_process=false --setpref=network.process.enabled=false --setpref=layers.d3d11.enable-blacklist=false --download-symbols=ondemand
[task 2023-12-20T03:07:05.715Z] Python 3.8+ is required to run mach.
[task 2023-12-20T03:07:05.715Z] You are running Mach with Python 3.7.5
[task 2023-12-20T03:07:05.715Z] See https://firefox-source-docs.mozilla.org/setup/linux_build.html#installingpython
[task 2023-12-20T03:07:05.715Z] for guidance on how to install Python on your system.
[task 2023-12-20T03:07:05.718Z] cleanup
- [wpt1 passed](https://firefox-ci-tc.services.mozilla.com/tasks/L97er2CcQcSdVnNc4ZMEag/definition) - task dependencies:
  - OnR89MlTSlShCrdf2F_VSA <- new in retrigger, task label: Action: Retrigger
  - P-IVao4hT6S_hV6ZFD5bgg
  - TXQtv_ViRueEF565Ql5lOw
  - a2FlXVSOQou0DeVAYyepPA
  - cJCZU90jTsWHsJm1_2O-IA
  - dcCAhtrQSNGfy68R7l0Dhw
  - f2fTnEdAQ8yJOMMYn6dnPg
  - fN6H5bi-QdWoyc8kIRrmwA
  - XnAx3iZBTdGbje-69txv6Q <- new in retrigger, task label: Gecko Decision Task
the task doesn't run mach, but runs the python harness directly:
[task 2023-12-20T03:23:24.691Z] + /builds/worker/bin/run-mozharness
[task 2023-12-20T03:23:24.693Z] Running: python3 /builds/worker/workspace/mozharness/scripts/web_platform_tests.py --config-file /builds/worker/workspace/mozharness/configs/android/android-x86_64.py --config-file /builds/worker/workspace/mozharness/configs/web_platform_tests/prod_config_android.py --test-type=testharness --skip-implementation-status=backlog --skip-implementation-status=not-implementing --skip-timeout --skip-crash --exclude-tag=webgpu --exclude-tag=canvas --disable-fission --setpref=media.peerconnection.mtransport_process=false --setpref=network.process.enabled=false --setpref=layers.d3d11.enable-blacklist=false --download-symbols=ondemand
[task 2023-12-20T03:23:24.910Z] 03:23:24 INFO - ConsoleLogger online at 20231220 03:23:24Z in /builds/worker/workspace
[task 2023-12-20T03:23:24.911Z] 03:23:24 INFO - Run as /builds/worker/workspace/mozharness/scripts/web_platform_tests.py --config-file /builds/worker/workspace/mozharness/configs/android/android-x86_64.py --config-file /builds/worker/workspace/mozharness/configs/web_platform_tests/prod_config_android.py --test-type=testharness --skip-implementation-status=backlog --skip-implementation-status=not-implementing --skip-timeout --skip-crash --exclude-tag=webgpu --exclude-tag=canvas --disable-fission --setpref=media.peerconnection.mtransport_process=false --setpref=network.process.enabled=false --setpref=layers.d3d11.enable-blacklist=false --download-symbols=ondemand
[task 2023-12-20T03:23:24.919Z] 03:23:24 INFO - Dumping config to /builds/worker/workspace/logs/localconfig.json.
[task 2023-12-20T03:23:24.921Z] 03:23:24 INFO - {'allow_software_gl_layers': False,
[task 2023-12-20T03:23:24.921Z] 03:23:24 INFO - 'android_version': 24,
[task 2023-12-20T03:23:24.921Z] 03:23:24 INFO - 'append_to_log': False,
[task 2023-12-20T03:23:24.921Z] 03:23:24 INFO - 'backlog': False,
I have no idea why web-platform-tests in this one case would be attempting to run mach instead of the python harness.
Comment 3•2 years ago
> I have no idea why web-platform-tests in this one case would be attempting to run mach instead of the python harness.
Because of this:
https://searchfox.org/mozilla-central/rev/b580e3f77470b2337bc8ae032b58a85c11e66aba/taskcluster/scripts/tester/test-linux.sh#259
What is GECKO_PATH? Per the log:
[setup 2023-12-20T03:06:47.038Z] GECKO_PATH is /builds/worker/checkouts/gecko
And that directory comes from a cache:
[taskcluster 2023-12-20 03:05:28.120Z] using cache "gecko-level-3-checkouts-hg58-v3-c52fd4fedd061ea8b8e3" -> /builds/worker/checkouts
Nothing else in the log shows an explicit checkout happening. Unfortunately, the worker is not available anymore, so it's not possible to look at what specific previous task may have had a checkout, but that would be the only reason for the checkout to be there.
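The behaviour described above can be modelled roughly as follows. This is a Python sketch of the decision, not the shell code in test-linux.sh; it assumes the linked line prefers `mach python` whenever an executable `mach` exists under GECKO_PATH, which matches the two command lines seen in the logs. `build_command` and `demo_stale_cache` are hypothetical names.

```python
import os
import tempfile


def build_command(gecko_path, script, python="python3"):
    """Model (assumed): prefer `mach python` if gecko_path contains an executable mach."""
    mach = os.path.join(gecko_path, "mach")
    if os.path.isfile(mach) and os.access(mach, os.X_OK):
        return [python, mach, "python", script]
    return [python, script]


def demo_stale_cache():
    """Simulate a cache directory that still holds a checkout from an earlier task."""
    with tempfile.TemporaryDirectory() as cache_dir:
        mach = os.path.join(cache_dir, "mach")
        with open(mach, "w") as handle:
            handle.write("#!/usr/bin/env python3\n")
        os.chmod(mach, 0o755)
        # The task never asked for a checkout, but the file is there anyway.
        return build_command(cache_dir, "web_platform_tests.py")
```

Under this model the outcome depends only on what a previous task left in the cache, which would explain why a retrigger on a different worker passes.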
Comment 4•2 years ago
I am not sure of the best solution here. It sounds like we won't see this unless it happens on mozilla-release again, or in the future when the Python version available in the image again conflicts with the version mach requires.
We could put attributes in the tasks to indicate TASK_USE_CHECKOUT, then set that as an environment variable so that test-linux.sh only uses mach if that environment variable is true.
:glandium, does this sound reasonable, or are you looking to solve something else here?
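A rough sketch of that gating, written in Python for illustration only (the real change would live in test-linux.sh). TASK_USE_CHECKOUT is the variable name proposed above; the helper name `should_use_mach` and the accepted truthy spellings are assumptions.

```python
def should_use_mach(checkout_has_mach, env):
    """Use mach only when the task opted in AND a checkout with mach is present.

    `env` is a mapping such as os.environ; the accepted values are an assumption.
    """
    opted_in = env.get("TASK_USE_CHECKOUT", "").strip().lower() in ("1", "true")
    return bool(checkout_has_mach) and opted_in
```

With this shape, a stale checkout in the cache is ignored unless the task definition explicitly asked for one, e.g. `should_use_mach(True, {})` is false while `should_use_mach(True, {"TASK_USE_CHECKOUT": "true"})` is true.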
Comment 5•2 years ago
> Because of this:
> https://searchfox.org/mozilla-central/rev/b580e3f77470b2337bc8ae032b58a85c11e66aba/taskcluster/scripts/tester/test-linux.sh#259
Yeah, so this seems like the bug here. It's assuming that if a checkout exists, then the task is meant to run from that checkout, which clearly isn't the case due to the cache.
Joel's approach sounds reasonable to me and is probably simplest. Maybe a more proper solution would be to remove that line and push the logic that determines which Python to use into the task definitions, but that will be a bit annoying to do as it's only the ubuntu1804-test image that's using this script atm, so you'd need to add Taskgraph logic that tightly couples to this image somewhere.
Comment 6•2 years ago
Maybe another angle here: if the task doesn't need a checkout, why does it mount a checkout cache volume in the first place? I'm guessing the run_task transforms just add it regardless. IIRC tasks can opt out, but they need to explicitly pass use-caches: false or something like that. Maybe we can be a bit smarter about automatically determining whether checkout caches are actually needed or not.
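As a very rough illustration of "being smarter", the sketch below decides which caches to mount from the task description. It is not Taskgraph code: the dictionary layout, the `needs-checkout` key, and the `caches_for_task` helper are all hypothetical, and `use-caches` is only the opt-out recalled above.

```python
def caches_for_task(task, default_caches=("checkouts",)):
    """Return the cache names to mount for a task (hypothetical model)."""
    run = task.get("run", {})
    # Explicit opt-out, as recalled in the comment above.
    if run.get("use-caches") is False:
        return []
    # Hypothetical flag: skip the checkout cache unless the task needs a tree.
    if not run.get("needs-checkout", False):
        return [name for name in default_caches if name != "checkouts"]
    return list(default_caches)
```

A test task that never declares a need for a source tree would then get no checkout cache, so there would be no stale mach for the wrapper script to find.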
Comment hidden (Intermittent Failures Robot)
Comment 8•2 years ago
This is where the checkout setup comes from:
https://searchfox.org/mozilla-central/rev/a9cb718ef1502bc0fe5088476748e334fad9d6a1/taskcluster/gecko_taskgraph/transforms/job/mozharness_test.py#188-190
And this comes from bug 1304484.
We'd want to split the setting of the environment variables off support_vcs_checkout.
Comment 9•2 years ago
The severity field is not set for this bug.
:jmaher, could you have a look please?
For more information, please visit BugBot documentation.