DNS: temporary errors (remote DNS servers unavailable) are misreported

NEW
Unassigned

Status


Core
Networking
P5
normal
17 years ago
2 months ago

People

(Reporter: D. J. Bernstein, Unassigned)

Tracking

(Depends on: 1 bug)

Trunk
mozilla1.0.1
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [necko-would-take], URL)

(Reporter)

Description

17 years ago
With Mozilla 0.7 under UNIX, the above URL produces error 804b001e, just
like a nonexistent host: ``hardtoreach.cr.yp.to could not be found.
Please check the name and try again.''

Wrong, wrong, wrong. The host exists. You're simply having trouble
reaching it. DNS understands this; why doesn't Mozilla?

Temporary DNS errors are just like TCP connection timeouts. They are
caused by temporary outages and overloads. You are deceiving the user
when you tell him to check the name.

What happens when the user's local network briefly goes down? If his DNS
cache happens to know the address, he'll have a TCP connection timeout.
Otherwise he'll have a temporary DNS error. The error message should be
the same either way: the server is temporarily unreachable.

Comment 1

17 years ago

*** This bug has been marked as a duplicate of 53967 ***
Status: UNCONFIRMED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → DUPLICATE
(Reporter)

Comment 2

17 years ago
No, this is not a duplicate of 53967. Learn to read.
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---

Comment 3

17 years ago
Reporter please see the first description for bug 53967.  The problem you are
having is identical to that bug.

Also, please be kind to the bug triagers.

*** This bug has been marked as a duplicate of 53967 ***
Status: UNCONFIRMED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → DUPLICATE

Comment 4

17 years ago
Agree on duplicate.
Verifying.
Status: RESOLVED → VERIFIED
(Reporter)

Comment 5

17 years ago
Once again: This is not a duplicate of 53967. This is about the bogus
error messages reported to the user for all FQDNs when his network
connection fails. 53967 is a low-level bug that applies only to FQDNs
with CNAME records.
Status: VERIFIED → UNCONFIRMED
Resolution: DUPLICATE → ---

Comment 6

17 years ago
DNS -> gordon
Is there a way to reproduce this easily? Like a fake dns that will always be
down and trigger this bad error report?
Assignee: neeti → gordon
Fabian: that will probably work.
How about an error: "DNS error" or something like that so the user knows he has
a DNS error.
Reporter: Are you saying that this bug is about the error message being wrong
when your DNS server fails or goes down and you can't connect to any site? If
so, then this is definitely not a duplicate. If not, then can you please give a
longer explanation *in English* of what you are talking about?

Comment 10

17 years ago
jg@cyberstorm:~$ host hardtoreach.cr.yp.to
[ nothing ]
jg@cyberstorm:~$ host cr.yp.to
cr.yp.to                A       131.193.178.181

On clicking the URL, mozilla does not report any failures at all, it just stops
trying to load the page after a while.

Could be related to bug 46537 - reporter can you take a look and report your
opinion here please?

It appears that the nameservers for cr.yp.to don't hold a record for the
'hardtoreach' host, and Mozilla silently ignores the problem, giving the user no
notice of failure. Reporter - is this what you are talking about? Sorry, not
many of us are technically qualified networking people, and thus this bug is a
little difficult to pin down during triage; thanks for bearing with us though :)

Setting qawanted to hit another radar.

Keywords: qawanted

Comment 11

17 years ago
Ben, can you help us here?

Comment 12

17 years ago
I'll look at all the references tomorrow. For now, I'd like to get to the point:

What is the specific test case? Is this the state where a DNS lookup to a server
failed because the DNS server failed? I think your test case is:

STEPS:

Put a server into the DNS search list that doesn't exist (or turn off
BIND/unplug DNS server).
Surf.
Observe error. 
Find that error needs improvement.
Status: UNCONFIRMED → NEW
Ever confirmed: true

Updated

17 years ago
Target Milestone: --- → mozilla0.9.1

Updated

17 years ago
Whiteboard: [DNS]

Comment 13

17 years ago
re: qawanted keyword

If we get some kind of clearly defined problem description, I'll take QA of this
bug as benc@netscape.com.

I think D.J. and James are describing two problems:

D.J.: we should error and provide different suggestions.
James: we are failing to connect and not erroring.

Comment 14

17 years ago
can wait till 0.9.2
Target Milestone: mozilla0.9.1 → mozilla0.9.2
(Reporter)

Comment 15

17 years ago
You're making a novice programming mistake: treating temporary DNS
errors as if they were permanent. Temporary DNS errors should be treated
just like temporary failures to connect to an HTTP server.

Example 1: I supplied the URL http://hardtoreach.cr.yp.to in the
original problem report. Try giving that URL to Mozilla. (Isn't this
why we're reporting URLs?) You will see the message

   hardtoreach.cr.yp.to could not be found. Please check the name and
   try again.

just as if you had used http://nonexistent.cr.yp.to. That message is
wrong. The correct message is

   The operation timed out when attempting to contact
   hardtoreach.cr.yp.to.

just as if you had used http://1.2.3.4.

Example 2: If you unplug your own network connection, and try to connect
to aol.com, you might get a DNS error like http://hardtoreach.cr.yp.to,
or (if the aol.com address is in the local DNS cache) you might get an
HTTP connection failure like http://1.2.3.4. Either way, it's stupid to
tell the user to check the aol.com name.
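
The distinction argued for in this comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Mozilla or NSPR code: it assumes the platform resolver reports temporary failures as EAI_AGAIN and nonexistent names as EAI_NONAME through getaddrinfo(). The helper names describe_dns_failure and lookup_or_message are hypothetical, and the third fallback message is invented for the example; the first two messages are the ones quoted above.

```python
import socket

# Resolver codes that signal a transient failure (assumption: the platform
# reports an unreachable or overloaded DNS server as EAI_AGAIN).
TEMPORARY_CODES = {socket.EAI_AGAIN}


def describe_dns_failure(host: str, code: int) -> str:
    """Map a getaddrinfo() error code to a user-facing message."""
    if code in TEMPORARY_CODES:
        # Same wording as a TCP connection timeout: the host may well exist.
        return f"The operation timed out when attempting to contact {host}."
    if code == socket.EAI_NONAME:
        # Only here is it reasonable to ask the user to check the name.
        return f"{host} could not be found. Please check the name and try again."
    # Hypothetical catch-all message for any other resolver error.
    return f"An unexpected DNS error occurred while looking up {host}."


def lookup_or_message(host: str, port: int = 80) -> str:
    """Resolve host; return "ok" or the message describing the failure."""
    try:
        socket.getaddrinfo(host, port)
        return "ok"
    except socket.gaierror as err:
        return describe_dns_failure(host, err.errno)
```

Calling describe_dns_failure("hardtoreach.cr.yp.to", socket.EAI_AGAIN) yields the timeout message rather than the "check the name" message, which is the behavior requested in Example 1.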

Comment 16

17 years ago
mass move, v2.
qa to me.
QA Contact: tever → benc

Comment 17

17 years ago
I've given this discussion some additional thought since it seems to lack
direction. I think the general state of this error is acceptable, in contrast to
error problems we have elsewhere in Necko.

I think what is needed is for cases where DNS fails to respond to produce errors
that highlight the likelihood that the problem is temporary.

There should be two basic types of DNS errors reported:

1- Connection failed: The host is not in DNS.
2- Connection failed: DNS is not working right now. Try again later.

I do not think that TCP connection failures should be represented to users the
same way, because firewalls and routers do a variety of weird things these days.

That should probably be a discussion in a separate bug.
Keywords: qawanted

Updated

17 years ago
Priority: -- → P4

Updated

17 years ago
Target Milestone: mozilla0.9.2 → mozilla0.9.3

Updated

17 years ago
Target Milestone: mozilla0.9.3 → mozilla1.0

Comment 18

16 years ago
Bugs targeted at mozilla1.0 without the mozilla1.0 keyword moved to mozilla1.0.1 
(you can query for this string to delete spam or retrieve the list of bugs I've 
moved)
Target Milestone: mozilla1.0 → mozilla1.0.1

Comment 19

15 years ago
Unless I'm mistaken, all the basic error messages discussed here work correctly now.

I test them regularly in:
http://www.mozilla.org/quality/networking/testing/coretests.html

The only really bad error is that if the DNS server is not responding, we say
"hostname could not be found", bug 164715.
Status: NEW → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → WORKSFORME

Updated

15 years ago
Summary: Temporary DNS errors are misreported → DNS: temporary errors are misreported

Comment 20

15 years ago
REOPEN: (to go to wontfix)

I think I've finally figured out what's really going on here.

The temporary error is the authoritative DNS server being unavailable for A
records that are not cached elsewhere.

> hardtoreach.cr.yp.to
Server:  ns1.nscp.aoltw.net
Address:  10.169.8.5

Non-authoritative answer:
hardtoreach.cr.yp.to    nameserver = ns.hardtoreach.cr.yp.to

Authoritative answers can be found from:
hardtoreach.cr.yp.to    nameserver = ns.hardtoreach.cr.yp.to
ns.hardtoreach.cr.yp.to internet address = 131.193.178.248       

This error is also not nearly as inaccurate as initially suggested, because it
is vague but true: "HOSTNAME could not be found (in DNS)". This message does not
say that the entry does or does not exist. nslookup says "Non-existent
host/domain", which is even more vague, because the error is "I couldn't find
the domain for the hostname I'm looking for."

From a technical perspective, making this distinction is desirable, but not
possible, based on gordon's comments in Bug 164715 #7, because NSPR treats all
errors the same.

From a realistic perspective, if your domain information is not available
because you have a lot of nameserver downtime, you need to peer w/ someone that
has better uptime, and/or register w/ an ISP that will provide reliable
secondary service.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Summary: DNS: temporary errors are misreported → DNS: temporary errors (remove DNS servers unavailable) are misreported

Updated

15 years ago
Summary: DNS: temporary errors (remove DNS servers unavailable) are misreported → DNS: temporary errors (remote DNS servers unavailable) are misreported

Comment 21

14 years ago
this depends on handling more DNS errors via NSPR.
Depends on: 196076
Whiteboard: [DNS]

Comment 22

10 years ago
(In reply to comment #21)
> this depends on handling more DNS errors via NSPR.
> 

I stumbled upon this bug while fiddling with nsIDNSService, which indeed returns only one error (NS_ERROR_UNKNOWN_HOST) whatever the actual DNS failure reason is, and is quite limited, making it unusable for the purpose of writing a decent DNS client.

Your comment seems to suggest some in-depth hacking is required to get a better DNS service in Mozilla.
So (given the age of this bug), has any progress been made enhancing NSPR, making it possible to write a better nsIDNSService?
If not, are there any (possibly documented?) plans to do so?

Apologies if this was not the right place to ask.

Thanks! :)

Comment 23

10 years ago
Oliver, have you done any further research?

Only bug 393372 and bug 196076 mention DNS - and there is no progress in those bugs.
Assignee: gordon → nobody
Status: REOPENED → NEW
QA Contact: benc → networking

Comment 24

10 years ago
Wayne: yes and no.
If I understood things well:
 * there is no current plan to enhance the DNS components (admittedly, there is little use for it in the browser - fine-grained DNS error reporting is probably meaningless for most users).
 * if one wants more/better right now, one has to code it, and that (probably) means hacking NSPR...

Sure, better DNS handling is desirable... and there are (IIRC) a couple of (somewhat related) issues with DNS failure handling that would benefit from that (WPAD, for example, can in some conditions lead to bad freezes with odd DNS server configurations) - not too sure about the bug #s...

Comment 25

10 years ago
This probably can't be fixed unless NSPR is fixed... my recollection is that the DNS service is basically passing the failure code back from NSPR, since it looks to the OS for "failed" or "here it is!".

We have lots of other higher-level DNS problems to deal with, but yes, in a perfect world, I'd still like us to have better DNS error handling than everyone else. After a 5 year hiatus, I was a bit surprised to see this bug still open!
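
As a rough illustration of what "not collapsing the failure code" could look like, here is a minimal Python sketch. It is not NSPR or nsIDNSService code: LookupStatus and resolve are hypothetical names, and the sketch assumes the OS resolver distinguishes EAI_AGAIN (temporary) from EAI_NONAME (no such host). The resolver is passed in as a parameter so the behavior can be exercised without a network.

```python
import enum
import socket


class LookupStatus(enum.Enum):
    """Hypothetical result categories a DNS layer could preserve."""
    OK = "ok"
    NOT_FOUND = "not found"
    TEMPORARY = "temporary failure"
    OTHER = "other failure"


def resolve(host, resolver=socket.getaddrinfo):
    """Return (status, addresses) instead of a bare success/failure flag."""
    try:
        infos = resolver(host, None)
    except socket.gaierror as err:
        # Keep the reason the OS resolver gave rather than discarding it.
        if err.errno == socket.EAI_AGAIN:
            return LookupStatus.TEMPORARY, []
        if err.errno == socket.EAI_NONAME:
            return LookupStatus.NOT_FOUND, []
        return LookupStatus.OTHER, []
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string. Deduplicate and sort for stability.
    addresses = sorted({info[4][0] for info in infos})
    return LookupStatus.OK, addresses
```

A caller that receives LookupStatus.TEMPORARY can then show a "try again later" message, while LookupStatus.NOT_FOUND can keep the existing "check the name" wording.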
Whiteboard: [necko-would-take]
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P4 → P5