For Android tests remove usage of `/etc/hosts`
Categories
(Testing :: web-platform-tests, defect, P1)
Tracking
(firefox70 fixed)
 | Tracking | Status |
---|---|---|
firefox70 | --- | fixed |
People
(Reporter: whimboo, Assigned: whimboo)
Attachments
(1 file)
When trying to run the web-platform-tests locally on my Mac I see failures with `adb remount` like:
File "/Users/henrik/code/gecko/testing/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py", line 207, in init
self.browser.start(group_metadata=group_metadata, **self.browser_settings)
File "/Users/henrik/code/gecko/testing/web-platform/tests/tools/wptrunner/wptrunner/browsers/fennec.py", line 206, in start
write_hosts_file(self.config, self.runner.device.device)
File "/Users/henrik/code/gecko/testing/web-platform/tests/tools/wptrunner/wptrunner/browsers/fennec.py", line 101, in write_hosts_file
device.remount()
File "/Users/henrik/code/gecko/testing/mozbase/mozdevice/mozdevice/adb.py", line 1751, in remount
raise ADBError("Unable to remount device")
ADBError: Unable to remount device
This is because we run this step again for every browser start. With bug 1560031 we would push the file only once, but that would also require a fair amount of refactoring.
So as a quick workaround we could check whether pushing the file fails, and only if so run the remount command once.
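The proposed workaround could look roughly like the sketch below. It is not the actual wptrunner code: `push_hosts_file` and `FakeDevice` are hypothetical names used for illustration, and the local `ADBError` class only stands in for the mozdevice exception seen in the traceback. The one assumption about the real device object is that its `push()` and `remount()` methods raise on failure.

```python
class ADBError(Exception):
    """Stand-in for mozdevice's ADBError so the sketch is self-contained."""


def push_hosts_file(device, local_path, remote_path="/etc/hosts"):
    """Push the hosts file, remounting only if the first push fails.

    Returns True if a remount was needed, False otherwise.
    """
    try:
        device.push(local_path, remote_path)
        return False
    except ADBError:
        # Assumption: the push failed because the partition is read-only.
        # Remount once and retry; a second failure propagates to the caller.
        device.remount()
        device.push(local_path, remote_path)
        return True


class FakeDevice:
    """Hypothetical test double: pushes fail until remount() is called."""

    def __init__(self, writable=False):
        self.writable = writable
        self.remount_calls = 0
        self.pushed = []

    def remount(self):
        self.remount_calls += 1
        self.writable = True

    def push(self, local, remote):
        if not self.writable:
            raise ADBError("Read-only file system")
        self.pushed.append((local, remote))
```

With this shape a healthy device never sees a remount at all, and a device that really cannot be remounted still fails loudly instead of being retried on every browser start.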
Assignee | ||
Comment 1•5 years ago
I got into that state while a broken Android emulator instance was running whose system partition cannot be remounted. Then I put the MBP into hibernation, and when I continued working I had a new IP address. Because of the different IP we kept trying to push the hosts file to the device, which kept failing because the system partition was still read-only.
While `adb remount` is broken, I was still able to run `mount -o rw,remount /`, which successfully remounted the partition as read-write. So I assume this is a bug in adb. CC'ing Geoff for information.
Given that there is nothing we can do in either web-platform or mozdevice, I'm going to close this bug as invalid.
Assignee | ||
Comment 2•5 years ago
As it turns out, there is an even more important problem with the current approach. It's not only that your machine gets a new IP address on every network, which then ends up in the hosts file, but also that Fennec tries to load the test pages from such an external IP, which crashes the browser because MOZ_DISABLE_NONLOCAL_CONNECTIONS is set.
This means that the hosts file should only contain references to localhost, and adb reverse ports have to be set up to route requests from the device to the host.
With that approach we also have to write the hosts file only once, so the issue reported in comment 0 will never appear.
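A minimal sketch of that idea, under stated assumptions: `hosts_file_lines` and `adb_reverse_commands` are hypothetical helper names, and the domains and ports passed to them are whatever the test servers happen to use. `adb reverse tcp:<port> tcp:<port>` is the standard adb command for routing a device port back to the host; nothing else here is taken from wptrunner.

```python
def hosts_file_lines(domains):
    """Map every test domain to the loopback address only."""
    return ["127.0.0.1 {}".format(domain) for domain in domains]


def adb_reverse_commands(ports, serial=None):
    """Build one `adb reverse` invocation per test server port.

    Each command makes connections to localhost:<port> on the device
    reach the same port on the host machine.
    """
    base = ["adb"]
    if serial is not None:
        # Target a specific device when more than one is attached.
        base += ["-s", serial]
    return [
        base + ["reverse", "tcp:{}".format(port), "tcp:{}".format(port)]
        for port in ports
    ]
```

Each returned list can be passed to `subprocess.check_call`. Because the hosts file then never mentions the host's external IP, its content stays the same across network changes.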
Assignee | ||
Comment 3•5 years ago
As discussed with James, we actually want to get rid of writing the hosts file. This can be done by setting the value of the preference `network.dns.localDomains` to all the known domains, and by adding reverse ports from the device to the host for each and every test server.
Here is a try build:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dfed576a6f90168326e019dac04e84b7d52248b5
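The preference-based approach from this comment can be sketched as follows. This is an illustration, not the landed patch: `local_domains_pref` and `build_prefs` are hypothetical names, and the domain list is assumed to come from the test server configuration. The sketch relies on `network.dns.localDomains` taking a comma-separated list of host names that Firefox resolves to localhost.

```python
def local_domains_pref(domains):
    """Return a comma-separated value for network.dns.localDomains."""
    unique = []
    for domain in domains:
        # Drop duplicates but keep the first-seen order stable.
        if domain not in unique:
            unique.append(domain)
    return ",".join(unique)


def build_prefs(domains):
    """Build the profile preferences that replace the hosts file."""
    return {"network.dns.localDomains": local_domains_pref(domains)}
```

For example, `build_prefs(["web-platform.test", "www.web-platform.test"])` yields a single preference that can be written into the test profile, so nothing on the device's system partition needs to change. The reverse ports still have to be set up separately for each test server.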
Assignee | ||
Updated•5 years ago
Assignee | ||
Comment 4•5 years ago
Assignee | ||
Comment 5•5 years ago
The last try build was a bit broken, so here is a new one:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=78ed3bfbda094271c4b1bb5ba23316e058fb6227
Assignee | ||
Comment 6•5 years ago
I checked some of those failures, and none of them is actually reproducible on my local MacOS machine with the latest version of the emulator. So I had a closer look at the log files generated by those failing jobs, and noticed that jobs in CI do NOT use the same emulator that gets installed locally. That means there are situations in which we cannot easily reproduce a failure on a local machine. I had assumed otherwise after listening to Nick's Android bootcamp session, where he stated that it should be easy for developers to run tests locally and have them behave the same as in automation.
Geoff, why are we using a different kind of emulator in CI, but cannot make use of it locally because it's only downloadable from internal sites?
https://taskcluster-artifacts.net/L6M9Q8LyTmaqE0hFVVR_Dw/0/public/logs/live_backing.log
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - [
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - {
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - "algorithm": "sha512",
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - "visibility": "internal",
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - "filename": "android-sdk_r28.0.25.0-linux-x86emu.tar.gz",
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - "unpack": true,
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - "digest": "e62acc91f41ccef65a4937a2672fcb56362e9946b806bacc25854035b57d5bd2d525a9c7d660a643ab6381ae2e3b660be7fea70e302ed314c4b07880b2328e18",
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - "size": 241459387
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - }
[task 2019-06-28T01:13:17.215Z] 01:13:17 INFO - ]
Comment 7•5 years ago
fwiw, the claim I usually make is that the same avd(s) (the same version(s) of Android) are available in mach as in CI.
Using the same emulator is tricky:
- we support local mach testing for android on Linux and MacOS (with some interest in Windows too!); CI is Ubuntu 16.04 only
- in CI, we use different emulators: an older arm emulator for the Android 4.3 tests, the latest 29.0.11 x86_64 emulator for most Android 7.0 tests, and a slightly older, 28.0.25 x86_64 emulator for tests that have trouble with the latest version (see bug 1556058); locally, things get confusing if you try to install more than one emulator at a time
- the android sdk license does not allow us to redistribute the emulator, so we use an internal-only tooltool artifact in CI; local mach bootstrap downloads the latest emulator from google
Even if you do get the same host OS, emulator, and avd, in my experience tests often do not behave exactly the same as in CI. The emulator is influenced by the host hardware and system libraries. Thankfully we have 'try'!
Comment 8•5 years ago
(In reply to Geoff Brown [:gbrown] (pto July 8-12) from comment #7)
> fwiw, the claim I usually make is that the same avd(s) (the same version(s) of Android) are available in mach as in CI.
I will adopt this language myself. Sorry to mislead, Henrik, and thanks for uncovering a thorny problem in the wpt runner.
A new try push that includes the latest emulator update from Bug 1563766:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f95eda8b0d41aad91f337c41fcd951428336605d
Assignee | ||
Comment 11•5 years ago
Thanks Maja for pushing this new try build! It looks very promising. Nearly all the failures seem to be known, which leaves only the following for further investigation, namely why it unexpectedly passes instead of timing out with my patch:
TEST-UNEXPECTED-PASS | /html/semantics/embedded-content/the-iframe-element/iframe_sandbox_navigation_download_allow_downloads_without_user_activation.sub.tentative.html | Navigation resulted download in sandbox is allowed by allow-downloads-without-user-activation. - expected TIMEOUT
Assignee | ||
Comment 12•5 years ago
Actually, the expected timeout was added by bug 1563766, which recently switched the web-platform-tests over to the new emulator. So whatever happened there, it looks like we can revert this change.
Here is a new try build:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=263053fac0d3bb1e088c1eb168e482bc294c1def
Assignee | ||
Comment 13•5 years ago
Updated try build for another unexpected failure caused by a recent change in wpt-15:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=aae97d943e5a6a08ff75d6145c1a6ccb9a0bff3a
Comment 14•5 years ago
Comment 17•5 years ago
bugherder |