Closed
Bug 1441391
Opened 6 years ago
Closed 6 years ago
TRR: Suspended browser can't resolve any names
Categories
(Core :: Networking, defect, P1)
RESOLVED
FIXED
mozilla60
Tracking | Status | |
---|---|---|
firefox60 | --- | fixed |
People
(Reporter: valentin, Assigned: bagder)
References
Details
(Whiteboard: [necko-triaged][trr])
Attachments
(1 file)
mcmanus: I came back to a laptop that had been suspended for an hr in mode 3 and couldn't resolve any names.. toggling the mode to 0 back to 3 fixed it.. can you file a bug? (today's nightly)
Assignee | ||
Comment 1•6 years ago
Hm, that has to somehow have botched the HTTPS requests themselves. It feels like we don't reset state back properly to set it up again when the DOH server's name is not in the cache anymore.
Comment 2•6 years ago
I have a log of this from overnight.
resolving host foo
no usable address in cache for host [foo]
trrlookup:: foo service not enabled
and this is mode 3, so we're stuck (but in any other mode I suspect we wouldn't be using trr at that point) so why won't the service get re-enabled?
Comment 3•6 years ago
I had this happen on my desktop today for the first time.. now the internet was unusually flaky, so it's certainly possible some kind of failed connection was the common thread
Assignee | ||
Comment 4•6 years ago
Changing the mode doesn't in itself trigger anything, but it will cause the regular resolver to be used. That this then helped mode 3 could imply that an address needed to be added to the DNS cache first, and then it worked? Do you have bootstrapAddress set?

If you get it stuck again like that, can you see if mode 2 or 1 also gets it back on track? Presumably they do.

Of course, if the connection is so bad that the HTTP requests fail, that could explain it as well, since the NS confirm might then fail and you end up stranded similar to how you describe. But that seems implausible as it would require a *really* bad connection situation.

(PS: a subject for more thinking is certainly what TRR can do to report exactly why it doesn't work/behave. I've personally managed to fill in the URL wrong, forget to set "useGET" etc, and when doing so TRR is just silent and it is far from obvious to a user why it isn't working correctly... your case is yet another version of "TRR doesn't do anything, why?")
Comment 5•6 years ago
bootstrap is indeed set. this is a bit worse than not doing anything: it was doing something and then stopped and got stuck there. It's certainly possible that it tried to do the NS confirm when coming back from suspend and that failed due to interface coming up wonkery.. but I think we would need to be robust to that.. and it doesn't really explain the desktop issue, where the NS should have already been checked and I don't see any reason that it would do it again.
Comment 6•6 years ago
daniel and I explored this a bit:
* various things reset the dns service upon problems like a network change
* that would reset the trr service too
* normally the NS check is gated on the cap-portal green light.. but
* in TRR-only mode that is bypassed because cap-portal has a dns dependency
* if the ns check fails due to connectivity it stays perma failed until the service is reset
* failure is possible in this scenario because cap-portal hasn't confirmed anything
* resetting the mode forces the check to be redone, that's why things work.

the fix is to set a backoff timer upon the ns check failing when mode = only, and try it again. that's basically what cap-portal would do.
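The failure sequence in the list above can be modeled with a small sketch. This is a simplified, hypothetical model and not the Firefox code: `TrrModel`, `Confirm`, `retryEnabled` and the method names are all invented for illustration. It only shows why a failed NS check stays failed without a retry, and how scheduling a retry in TRR-only mode lets the service recover without a mode toggle.

```cpp
#include <cassert>

// Hypothetical confirmation states; not the names used in TRRService.
enum class Confirm { Init, Ok, Failed };

struct TrrModel {
  bool trrOnly = true;        // assumed stand-in for "mode 3" (TRR-only)
  bool retryEnabled = false;  // false models the behaviour before the fix
  Confirm state = Confirm::Init;
  bool retryPending = false;

  // Called when an NS confirmation attempt finishes.
  void OnConfirmResult(bool success) {
    state = success ? Confirm::Ok : Confirm::Failed;
    // Proposed fix: in TRR-only mode a failure schedules another attempt.
    retryPending = !success && trrOnly && retryEnabled;
  }

  // Called when the (hypothetical) retry timer fires.
  void OnRetryTimer(bool success) {
    if (!retryPending) {
      return;  // nothing scheduled: the failed state is permanent
    }
    assert(state == Confirm::Failed);
    OnConfirmResult(success);
  }

  bool CanUseTrr() const { return state == Confirm::Ok; }
};
```

With `retryEnabled` left at `false`, a single failed check leaves `CanUseTrr()` false no matter how healthy the network later becomes, which matches the "stuck" symptom. With it set to `true`, the next timer tick gets another chance to confirm.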
Comment hidden (mozreview-request) |
Assignee | ||
Comment 8•6 years ago
My suggested patch here adds a retry mechanism and adds/removes some log output to help with diagnosing what's going on in the future...
Assignee | ||
Updated•6 years ago
Assignee: valentin.gosu → daniel
Priority: P2 → P1
Reporter | ||
Comment 9•6 years ago
mozreview-review
Comment on attachment 8957475 [details]
bug 1441391 - TRR: restart failed NS confirms in TRR-only mode

https://reviewboard.mozilla.org/r/226380/#review232292

::: netwerk/dns/TRR.cpp
(Diff revision 1)
>
> NS_IMETHODIMP
> TRR::Notify(nsITimer *aTimer)
> {
>   if (aTimer == mTimeout) {
> -    LOG(("TRR request for %s timed out\n", mHost.get()));

Did you mean to remove this?

::: netwerk/dns/TRRService.cpp:531
(Diff revision 1)
> +  if ((mConfirmationState == CONFIRM_FAILED) && (mMode == MODE_TRRONLY)) {
> +    // in TRR-only mode; retry failed confirmations
> +    NS_NewTimerWithCallback(getter_AddRefs(mRetryConfirmTimer),
> +                            this, mRetryConfirmInterval,
> +                            nsITimer::TYPE_ONE_SHOT);
> +    if (mRetryConfirmInterval < 64000) {

Should we reset the interval when confirmation succeeds?
Attachment #8957475 -
Flags: review?(valentin.gosu) → review+
Assignee | ||
Comment 10•6 years ago
mozreview-review-reply
Comment on attachment 8957475 [details]
bug 1441391 - TRR: restart failed NS confirms in TRR-only mode

https://reviewboard.mozilla.org/r/226380/#review232292

> Did you mean to remove this?

Yes, it's on purpose. This log output is not helpful and in fact mostly quite spammy. There's already another log output if the timeout actually cancels the HTTP channel, which is what will interest log readers.

> Should we reset the interval when confirmation succeeds?

Yes, good catch!
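For readers following the review exchange, here is a minimal sketch of a capped retry interval that is reset on success. It is not the landed patch: the quoted diff only shows the `< 64000` check, so the doubling step, the 1000 ms starting value, and every name here (`RetryBackoff`, `NextDelayMs`, `OnConfirmSucceeded`) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper; the real patch keeps this state in TRRService.
class RetryBackoff {
 public:
  // Returns the delay for the next retry, then grows the stored interval
  // while it is still below the cap.
  uint32_t NextDelayMs() {
    uint32_t delay = mIntervalMs;
    if (mIntervalMs < kCapMs) {
      mIntervalMs *= 2;  // assumption: the interval doubles per failure
    }
    return delay;
  }

  // Reset requested in review: start over once confirmation succeeds.
  void OnConfirmSucceeded() { mIntervalMs = kInitialMs; }

 private:
  static constexpr uint32_t kInitialMs = 1000;  // assumed starting value
  static constexpr uint32_t kCapMs = 64000;     // threshold from the diff
  uint32_t mIntervalMs = kInitialMs;
};
```

Without the reset, a later failure would start from whatever interval the previous round of failures had reached, so recovery after a brief outage could take up to the full cap.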
Comment hidden (mozreview-request) |
Comment 12•6 years ago
Pushed by daniel@haxx.se:
https://hg.mozilla.org/integration/autoland/rev/798a47cd74d5
TRR: restart failed NS confirms in TRR-only mode r=valentin
Comment 13•6 years ago
Backed out changeset 798a47cd74d5 (bug 1441391) for build bustages. CLOSED TREE
Log of the failure: https://treeherder.mozilla.org/logviewer.html#?job_id=166968944&repo=autoland&lineNumber=26817
Backout: https://hg.mozilla.org/integration/autoland/rev/1be797a51dbdd9b19d82e0aa0ba8be214c8771a2
Push that got backed out: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=798a47cd74d5e9e37d23935c2b03b0b5d58213f7&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified
Flags: needinfo?(daniel)
Assignee | ||
Comment 14•6 years ago
I hate those compiler errors that 'mach build' on my machine doesn't show... :-/
Flags: needinfo?(daniel)
Comment hidden (mozreview-request) |
Comment 16•6 years ago
Pushed by daniel@haxx.se:
https://hg.mozilla.org/integration/autoland/rev/558353d9fc61
TRR: restart failed NS confirms in TRR-only mode r=valentin
Comment 17•6 years ago
bugherder |
https://hg.mozilla.org/mozilla-central/rev/558353d9fc61
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60