Closed Bug 1498525 Opened Last year Closed Last year

Permafail in captive portal test with trr enabled

Categories

(Core :: Networking: DNS, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: jkt, Assigned: bagder)

References

Details

(Whiteboard: [necko-triaged][trr])

Attachments

(1 file)

See the following try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=59d274a9b6252103cdf9534eef359865c98a842e

The following test is always failing: netwerk/test/unit/test_captive_portal_service.js
Do I just have to set trr.mode to 2 and trr.url to something sensible when running this test to trigger this?
Assignee: nobody → daniel
Priority: -- → P1
Whiteboard: [necko-triaged][trr]
Valentin, any immediate thoughts on why this test would fail if TRR mode 2 is enabled by default?
Flags: needinfo?(valentin.gosu)
(In reply to Daniel Stenberg [:bagder] from comment #1)
> Do I just have to set trr.mode to 2 and trr.url to something sensible when
> running this test to trigger this?

Yup I set them to: https://hg.mozilla.org/try/rev/c764b0be4214f0db4a113cbcf89e2fede0691429
(In reply to Jonathan Kingston [:jkt] from comment #3)
> (In reply to Daniel Stenberg [:bagder] from comment #1)
> > Do I just have to set trr.mode to 2 and trr.url to something sensible when
> > running this test to trigger this?
> 
> Yup I set them to:
> https://hg.mozilla.org/try/rev/c764b0be4214f0db4a113cbcf89e2fede0691429

The crash is:
PID 5933 | FATAL ERROR: Non-local network connections are disabled and a connection attempt to mozilla.cloudflare-dns.com (104.16.111.25) was made.
PROCESS-CRASH | netwerk/test/unit/test_captive_portal_service.js | application crashed [@ mozilla::net::nsSocketTransport::InitiateSocket()]

We can't use cloudflare in our unit-tests, we have to set it to something local.
When I used localhost as the TRR URI (TRR doesn't work, obviously), the test passes.
However, I'm not sure why this is the only xpcshell-test with this problem.
Flags: needinfo?(valentin.gosu)
(In reply to Valentin Gosu [:valentin] from comment #4)
> However, I'm not sure why this is the only xpcshell-test with this problem.

I assume it's because the captive portal service is actually disabled for unit tests.
Regarding the pref... we could add mozilla.cloudflare-dns.com to the network.dns.localDomains pref - that would make it not crash. Then it's up to us if we also want to instantiate a proper DoH server for this test or not.
Thanks Valentin, of course that's why!

Another approach is to use:

  prefs.setBoolPref("network.dns.native-is-localhost", true);

... which converts all native name resolves internally to instead resolve "localhost". It will then simply make TRR not find a working DoH server if it even gets that far.

Whether we should have an actual DoH server for this test or not is an interesting question, but I think short-term we're better off not using one and instead we make sure that it works "like before". Longer term we could extend the test to also verify the same things when a DoH server is up and working etc.
By setting network.dns.native-is-localhost in this test, it makes all native
name resolves use "localhost" and thus avoids causing an assert if this test
runs with TRR enabled and network.trr.uri set to point to an actual external
host name.

MozReview-Commit-ID: D1df6VtfckR
I could add that with this patch, the test works with and without the changes :jkt did.
Pushed by daniel@haxx.se:
https://hg.mozilla.org/integration/autoland/rev/9305a99ea1c3
make test_captive_portal_service use localhost only r=valentin
https://hg.mozilla.org/mozilla-central/rev/9305a99ea1c3
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
You need to log in before you can comment on or make changes to this bug.