Investigate running tests on Windows / arm64
Categories
(Testing :: General, enhancement, P1)
Tracking
(Not tracked)
People
(Reporter: gbrown, Assigned: gbrown)
References
Details
Attachments
(3 files, 10 obsolete files)
We have Firefox builds for Windows aarch64. Can we run tests against them? Let's try running the tests we normally run against Windows 10 builds in continuous integration on a Lenovo Yoga and see what happens...
Assignee
Comment 1•6 years ago
I thought it would be convenient to run in a MozillaBuild environment: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Windows_Prerequisites#Getting_the_source
Recent releases of MozillaBuild won't install on the Yoga, because they are x86_64 only. But MozillaBuild 2.2.0 does install -- I'm using that. https://ftp.mozilla.org/pub/mozilla/libraries/win32/MozillaBuildSetup-2.2.0.exe
Assignee
Comment 2•6 years ago
It sounds like most (all) folks build for aarch64 on x86 machines -- not on the Yoga. So running in a mach build context is tricky. Instead, let's concentrate on running from test zips....
Assignee
Comment 3•6 years ago
A quick and dirty first attempt. This runs in the MozillaBuild environment, kicking off the mozharness script correctly. On my first try, desktop_unittest failed because the screen resolution could not be set.
Assignee
Comment 4•6 years ago
The screen resolution issue is related to scaling. By default, the Yoga display settings have "Change the size of text, apps, and other items" set to 150%. With this setting, our code for checking screen resolution reports incorrect (scaled) values, causing a fatal error. Change the setting to 100% to work around the problem.
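To illustrate why the check sees the wrong numbers, here is a minimal sketch. It assumes that a process which is not DPI-aware sees the physical resolution divided by the scale factor; the 1920x1080 panel size is a hypothetical value, not necessarily the Yoga's actual resolution, and `reported_resolution` is a made-up helper, not part of mozharness.

```python
def reported_resolution(physical_width, physical_height, scale_percent):
    """Return the (width, height) a DPI-unaware process would likely see.

    Assumption: the reported size is the physical size divided by the
    display scale factor (150% scaling divides each dimension by 1.5).
    """
    return (
        physical_width * 100 // scale_percent,
        physical_height * 100 // scale_percent,
    )


# Hypothetical panel size, used only for illustration.
print(reported_resolution(1920, 1080, 150))  # scaled values at 150%
print(reported_resolution(1920, 1080, 100))  # true values at 100%
```

At 100% scaling the reported and physical values agree, which is consistent with the workaround described above.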
Assignee
Comment 5•6 years ago
Next problem I encountered: powershell is called from mozharness, but powershell is not in the PATH of the MozillaBuild environment.
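One possible way to handle this, sketched in Python: check whether powershell can be found and, if not, append its directory to PATH before starting mozharness. The directory below is the usual default install location on Windows and is an assumption about the Yoga; `path_with_dir` is a hypothetical helper, not something from mozharness or MozillaBuild.

```python
import os
import shutil

# Usual default PowerShell location on Windows (assumption; adjust as needed).
POWERSHELL_DIR = r"C:\Windows\System32\WindowsPowerShell\v1.0"


def path_with_dir(path_value, directory, sep=os.pathsep):
    """Return path_value with directory appended unless already present."""
    entries = [entry for entry in path_value.split(sep) if entry]
    # Windows paths are case-insensitive, so compare in lower case.
    if directory.lower() not in (entry.lower() for entry in entries):
        entries.append(directory)
    return sep.join(entries)


if __name__ == "__main__":
    if shutil.which("powershell") is None:
        os.environ["PATH"] = path_with_dir(
            os.environ.get("PATH", ""), POWERSHELL_DIR
        )
    # Prints the resolved powershell path, or None if it is still not found.
    print(shutil.which("powershell"))
```

Changing `os.environ` only affects the current process and its children, so this would have to run in the same process that launches mozharness.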
Assignee
Comment 6•6 years ago
Worth noting that there is a considerable delay in/after fetch_url_into_memory that might feel like a hang - just give it a few minutes. I think this is just the effect of limited memory or a slow file system during archive extraction; warrants more investigation, eventually.
Assignee
Comment 7•6 years ago
With powershell in PATH, the mochitest test harness started Firefox on my next run. There was a crash, and a system alert that the firewall had blocked some features of python.exe; I clicked to allow. Later manifests ran mochitests - progress!
Comment 8•6 years ago
Excellent. I believe we turn those prompts off in automation as part of the automated setup; I do this manually all too often when running locally.
Assignee
Comment 9•6 years ago
...now a little more general, but doesn't handle all suites yet.
Assignee
Comment 10•6 years ago
Assignee
Comment 11•6 years ago
I have completed an initial run of cppunit, gtest, and mochitest-plain (all chunks) so far. Most logs have some failures, but most tests pass. There are some crashes, so I'm not sure we are completing all runs, but it looks like run-times are reasonable: maybe 20% slower than existing Windows 10 test tasks seen on treeherder.
Comment 12•6 years ago
Thanks :gbrown. I would be eager to see if mochitest-media and reftests will run. So far this is not sounding too scary once we get core tooling set up.
Assignee
Comment 13•6 years ago
Assignee
Comment 14•6 years ago
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #12)
> thanks :gbrown. I would be eager to see if mochitest-media and reftests will run. So far this is not sounding too scary once we get core tooling setup.
Those - mochitest-media and reftest - look okay, more or less. I see about 30 unexpected failures in reftest chunk 1, for example.
Comment 15•6 years ago
I tried the following:
mochitest-chrome
Tests are started, with the screen resolution being altered to 1280x1024. However, each test under widget/tests/ has a status of TEST-SKIP.
There is a failure caused by the step "Running manifest: browser/base/content/test/chrome/chrome.ini".
web-platform
Could not get tests to run - dependencies are installed but tests encounter a critical failure at metadata path.
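When comparing runs like these, it can help to tally the status tokens in a harness log. The following is a rough sketch, assuming each result line contains one token of the form TEST-SKIP, TEST-PASS and so on; the sample lines are hypothetical and only approximate real log output.

```python
import re
from collections import Counter

# Matches tokens such as TEST-SKIP, TEST-PASS or TEST-UNEXPECTED-FAIL.
STATUS_RE = re.compile(r"\bTEST-[A-Z]+(?:-[A-Z]+)*\b")


def count_statuses(lines):
    """Count the first TEST-* status token found on each log line."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "TEST-SKIP | widget/tests/test_example_a.html | took 0ms",
    "TEST-SKIP | widget/tests/test_example_b.html | took 0ms",
    "TEST-PASS | widget/tests/test_bug1151186.html | example message",
    "INFO | some unrelated harness output",
]

for status, count in sorted(count_statuses(sample_log).items()):
    print(status, count)
```

A real log could be passed in by opening the file and handing the file object to `count_statuses`.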
Comment 16•6 years ago
:egao, can you run the tests that gbrown ran to ensure that you have a working environment and that the scripts are set up correctly?
Assignee
Comment 17•6 years ago
(In reply to Edwin Gao (:egao) from comment #15)
> I tried the following:
> mochitest-chrome
> Tests are started, with the screen resolution being altered to 1280x1024. However, each test under widget/tests/ have status of TEST-SKIP.
There are a lot of skipped tests in widget/tests:
https://searchfox.org/mozilla-central/source/widget/tests/chrome.ini
but there should be some that run. For instance, widget/tests/test_bug1151186.html runs and passes for me.
> There is a failure caused by the step Running manifest: browser/base/content/test/chrome/chrome.ini.
I don't see that failure in my mochitest-chrome run: I have 4 passing tests on that path.
There might be variation depending on the build. I used task id MEP5IaDDRvagYX5ZN9sMzw.
> web-platform
> Could not get tests to run - dependencies are installed but tests encounter a critical failure at metadata path.
I haven't successfully run wpt yet either.
Assignee
Comment 18•6 years ago
This has updates for wpt, but I have not run wpt successfully yet.
Assignee
Comment 19•6 years ago
I found an issue with wpt: If I use my script to run any web-platform suite once, then run any web-platform suite again, the second run fails because it cannot delete build/tests/web-platform/tests/fonts/Ahem.ttf. I assume the file is locked because the font is installed? I imagine we would not see this in CI because we reboot between runs...or maybe the Windows worker makes special allowance for this? My workaround: reboot between runs.
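To find out which files are blocking the cleanup before the harness hits them, something like the sketch below could be run against the test directory. It is only an illustration: `clean_tree` is a hypothetical helper, and it assumes a locked file (such as an installed font) raises an OSError on deletion, as it typically does on Windows.

```python
import os


def clean_tree(root):
    """Delete everything under root; return the paths that could not be removed."""
    failed = []
    # Walk bottom-up so directories are emptied before we try to remove them.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                os.remove(path)
            except OSError:
                # Typically a PermissionError on Windows when the file is in use.
                failed.append(path)
        try:
            os.rmdir(dirpath)
        except OSError:
            pass  # directory still holds a file that could not be removed
    return failed
```

Calling `clean_tree("build/tests/web-platform")` after a run would list any files, such as Ahem.ttf, that are still locked.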
Comment 20•6 years ago
I have confirmed that we terminate an instance in AWS when we run wpt on Windows 10, so this theory of a reboot holds true.
Comment 21•6 years ago
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #16)
> :egao, can you run the tests that gbrown ran to ensure that you are have a working environment and scripts are setup correctly.
Definitely. I have gone ahead and created a matrix in Google Sheets that aims to corroborate the results that :gbrown is seeing. This way I can also check whether the issues are due to my environment, setup, or the task ID chosen.
(In reply to Geoff Brown [:gbrown] from comment #17)
> There are a lot of skipped tests in widget/tests:
> https://searchfox.org/mozilla-central/source/widget/tests/chrome.ini
> but there should be some that run. For instance, widget/tests/test_bug1151186.html runs and passes for me.
Perhaps my difficulties are due to choosing a bad build. I ran mochitest-chrome with both your task ID and my selected task ID. Guess which one worked and which did not. I apparently chose a win64/pgo build, which might explain my failures.
The Google Sheet is here.
Assignee
Comment 22•6 years ago
(In reply to Edwin Gao (:egao) from comment #21)
> The Google Sheet is here.
Thanks Edwin. Now that we have builds synced up, we seem to get basically the same results (mochitest-plain seems to have an oddly variable number of passes, but I see that too from one run to another).
web-platform tests need some work: there may still be environment or harness problems or wpt.
Otherwise, tests can be run, but we seem to see a lot of crashes. I'd like to verify that we still crash with a recent build (we've been testing with a build from Jan 16).
Comment 23•6 years ago
Bug 1512822 just landed on central so you may want to test with the next nightly after this post, just in case any of the previous test failures were compiler-related.
Assignee
Comment 24•6 years ago
(In reply to Geoff Brown [:gbrown] (pto Jan 28-30) from comment #19)
> I found an issue with wpt: If I use my script to run any web-platform suite once, then run any web-platform suite again, the second run fails because it cannot delete build/tests/web-platform/tests/fonts/Ahem.ttf. I assume the file is locked because the font is installed? I imagine we would not see this in CI because we reboot between runs...or maybe the Windows worker makes special allowance for this? My workaround: reboot between runs.
My font-locking problem goes away with the patch from bug 1522696 - that makes sense! - so reboots are no longer required.
Assignee
Comment 25•6 years ago
...now updated to run raptor-tp6-1
Comment 26•6 years ago
If this works, we should change the worker-type to something more permanent.
This is using a worker-group intended for testing generic-worker itself.
Comment 27•6 years ago
Comment 28•6 years ago
Added jittest to the list.
Comment 29•6 years ago
Assignee
Comment 30•6 years ago
Assignee
Comment 31•6 years ago
Now updated for all the raptor suites and all the talos suites run on Windows 10 x64. I have not tested all of these -- let me know if you find problems!
(In reply to Geoff Brown [:gbrown] from comment #31)
> Created attachment 9046793 [details]
> script for running mozharness on Yoga
> Now updated for all the raptor suites and all the talos suites run on Windows 10 x64. I have not tested all of these -- let me know if you find problems!
This has been working tremendously well, Geoff, and has saved me a ton of effort, mistakes, and frustration -- thanks again for it!
Nothing official yet in this tests-on-arm64 readiness tracking spreadsheet, but it's going well (apart from seemingly spurious Firefox + Windows 10 crashes (they're not often, but they are frequent)). I've also taken/been pushed a few Windows 10 updates (the latest of which is their "October Update"), so hopefully that helps.
Next steps (given priorities/time) are probably to start pushing CI-config (yaml?) changes to Try, and start gathering data from larger-scale test runs?
Comment 33•6 years ago
(In reply to Stephen Donner [:stephend] from comment #32)
> Next steps (given priorities/time) are probably to start pushing CI-config (yaml?) changes to Try, and start gathering data from larger-scale test runs?
We should file bugs for all failures, and either fix or disable them. If disabled, the bugs should remain open until the root cause is identified and resolved.
Assignee
Comment 34•6 years ago
The script attachment here does all that I wanted in this bug: allows simple local runs of virtually all test suites. If anyone needs additional capabilities, please needinfo me.
But now it's generally easier to use try (disabled by default currently, until we get more capacity), and egao is using that to identify and report test failures.
Assignee
Comment 35•6 years ago
The old one bitrotted -- updated!
Comment 36•5 years ago
Geoff, mind taking a look at adding (at least a subset of?) awsy-test to this convenience wrapper script? For context, I've been running and working on awsy-tp6 (sy-tp6 in Treeherder), in support of bug 1567138.
A sample of a successful run, today, without invoking via your script, looks like (run from c:/mozilla-build):
python.exe mozharness/scripts/awsy_script.py --cfg mozharness/configs/awsy/taskcluster_windows_config.py --test-packages-url https://queue.taskcluster.net/v1/task/VcvAL16PQxChDJ9BK8g8qQ/runs/0/artifacts/public/build/target.test_packages.json --installer-url https://queue.taskcluster.net/v1/task/VcvAL16PQxChDJ9BK8g8qQ/runs/0/artifacts/public/build/target.zip --download-symbols ondemand --tp6
I'm not well-versed on the various arguments one can pass, etc., but the relevant first entrypoint is https://wiki.mozilla.org/Project_Fission/Memory#AWSY_.28tp6.29.
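For reference, the command above can be assembled from a task ID along these lines. This is only a sketch: the script path, config path, flags and URL layout are copied from the sample run, while `artifact_url` and `awsy_command` are hypothetical helper names and not part of the attached script.

```python
QUEUE = "https://queue.taskcluster.net/v1/task"


def artifact_url(task_id, name, run=0):
    """Build the URL of a public build artifact for a task run."""
    return f"{QUEUE}/{task_id}/runs/{run}/artifacts/public/build/{name}"


def awsy_command(task_id, tp6=True):
    """Return the argument list for an awsy mozharness run."""
    cmd = [
        "python.exe",
        "mozharness/scripts/awsy_script.py",
        "--cfg", "mozharness/configs/awsy/taskcluster_windows_config.py",
        "--test-packages-url", artifact_url(task_id, "target.test_packages.json"),
        "--installer-url", artifact_url(task_id, "target.zip"),
        "--download-symbols", "ondemand",
    ]
    if tp6:
        cmd.append("--tp6")
    return cmd


# Task ID taken from the sample run above.
print(" ".join(awsy_command("VcvAL16PQxChDJ9BK8g8qQ")))
```

The list form can be handed to `subprocess.run` from c:/mozilla-build; whether omitting `--tp6` gives a plain awsy run is an assumption here.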
Assignee
Comment 37•5 years ago
Added support for awsy, awsy-base, awsy-tp6.
(In reply to Geoff Brown [:gbrown] from comment #37)
> Created attachment 9091960 [details]
> script for running mozharness on Yoga
> Added support for awsy, awsy-base, awsy-tp6.
Thanks; these additions (particularly awsy-tp6) have been serving me well. Really appreciate the quick turnaround!