Open
Bug 956267
Opened 10 years ago
Updated 2 years ago
localhost DNS lookup always fails if $HOME empty
Categories
(Core :: Networking, defect, P3)
Tracking
()
NEW
People
(Reporter: jidanni, Unassigned)
Details
(Whiteboard: [necko-backlog])
Attachments
(2 files)
User Agent: Mozilla/5.0 (X11; Linux i686; rv:28.0) Gecko/20100101 Firefox/28.0 Iceweasel/28.0a2 (Beta/Release)
Build ID: 20131215004001

Steps to reproduce:

Do the following test. Disconnect your computer from the network, and in /etc/hosts put

    127.0.0.1 abj.jidanni.org
    127.0.0.1 radioscanningtw.jidanni.org
    127.0.0.1 transgender-taiwan.org
    127.0.0.1 mysql.transgender-taiwan.org

and have apache serve them. Then restart e.g. dnsmasq, firefox, etc.

Well, chromium and w3m work fine. But for firefox, transgender-taiwan.org doesn't work! That's right, all the other virtual sites we created work except that one. Why? Because firefox finds that there are only two components, and insists on adding WWW. in front before trying it AT ALL! (You need to be offline to see this happening.) Setting browser.fixup.alternate.prefix to "" fixes it.

This is terrible --- not one little beep over HTTP is ever attempted. Please check for the version of the URL the user has asked for before stuffing the WWW. on, in BOTH online *AND FOR LOCALHOST* situations.
Reporter
Comment 1•10 years ago
On localhost, $ firefox http://X.Y/ from the command line is also turned into http://WWW.X.Y/ before EVEN TESTING FOR X.Y ! Q.X.Y and Q.R.X.Y etc. are immune, because they have 3+ components.
Severity: normal → major
Component: Untriaged → Location Bar
Priority: -- → P2
Comment 2•10 years ago
Not sure where this belongs, Document Navigation or Networking?
Component: Location Bar → Document Navigation
Product: Firefox → Core
Comment 3•10 years ago
Document Navigation is correct, since the relevant code is nsDefaultURIFixup::MakeAlternateURI, called by nsDefaultURIFixup::CreateFixupURI, which probably comes from this snippet in nsDocShell::EndPageLoad:
> // If the page load failed, then deal with the error condition...
> // Errors are handled as follows:
> //   1. Check to see if it's a file not found error or bad content
> //      encoding error.
> //   2. Send the URI to a keyword server (if enabled)
> //   3. If the error was DNS failure, then add www and .com to the URI
> //      (if appropriate).
> //   4. Throw an error dialog box...
Comment 4•10 years ago
My mistake, it's probably nsDocShell::LoadURI: http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#4281
Comment 5•10 years ago
The callsite pointed to in comment 4 passes either 0 or FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP to CreateFixupURI. But browser.fixup.alternate.prefix is only considered if the FIXUP_FLAGS_MAKE_ALTERNATE_URI flag is set. That flag is set in the code pointed to in comment 3, but that code would only run if we failed to resolve the hostname transgender-taiwan.org. Note that this is purely about DNS resolution; I would not expect any HTTP traffic before this failure case if it happens at all.

I see no other places that call createFixupURI with FIXUP_FLAGS_MAKE_ALTERNATE_URI in our tree. So it sure sounds like we're not finding an IP address for that hostname.

Reporter, could you please create a DNS resolution log for a minimal Firefox session that shows the problem for you using the steps at https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging but with NSPR_LOG_MODULES set to nsHostResolver:5 and then attach that log to this bug?
Flags: needinfo?(jidanni)
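For readers following the fixup discussion, the order of operations described in comment 5 can be modeled roughly as follows. This is a simplified Python sketch, not the Gecko code: `alternate_host` and `host_to_load` are hypothetical names, the two-component rule is taken from the reporter's observation in comment 1, and the handling of single-word hosts is an assumption based on the "add www and .com" code comment quoted in comment 3.

```python
from typing import Callable, Optional


def alternate_host(host: str, prefix: str = "www.", suffix: str = ".com") -> Optional[str]:
    """Return the alternate host to try, or None if no alternate applies.

    Hypothetical model of the behaviour reported in this bug, not real Gecko logic.
    """
    labels = host.split(".")
    if len(labels) == 1:
        # Assumption: a bare word such as "example" gets both prefix and suffix.
        return f"{prefix}{host}{suffix}" if (prefix or suffix) else None
    if len(labels) == 2:
        # Two components such as "x.y": only the prefix is added.
        # An empty prefix (browser.fixup.alternate.prefix = "") disables this.
        return f"{prefix}{host}" if prefix else None
    # Three or more components ("q.x.y") are left alone.
    return None


def host_to_load(host: str, resolves: Callable[[str], bool], prefix: str = "www.") -> str:
    """Resolve first; consider the alternate host only after resolution fails."""
    if resolves(host):
        return host
    alt = alternate_host(host, prefix=prefix)
    return alt if alt is not None else host
```

Under this model, `host_to_load("transgender-taiwan.org", lambda h: True)` returns the host unchanged, so seeing the www. variant at all would suggest that the original lookup failed first.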
Reporter
Comment 6•10 years ago
And wouldn't you know it, I can't reproduce it at all today, so I'll close this bug for now while I try different ways to reproduce it. When it was happening, tcpflow -i lo showed nothing, so you are right, it was all at the DNS level.
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Flags: needinfo?(jidanni)
Resolution: --- → INVALID
Reporter
Comment 7•10 years ago
Real results! Detach your computer from the internet. After boot, put

    127.0.0.1 abj.jidanni.org

in /etc/hosts, do /etc/init.d/dnsmasq stop or whatever to prevent any DNS interference, and have an apache2 server listening for that virtual host there.

    # cat script
    export NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5
    export NSPR_LOG_FILE=/tmp/log.$$.txt
    export HOME=/tmp/F
    mkdir -p $HOME
    firefox http://abj.jidanni.org/
    # chmod +x script
    # su - nobody -c $PWD/script #super pristine

NOW on the first run with an empty $HOME, firefox consistently CANNOT find abj.jidanni.org. Now quit firefox with CTRL+Q. On the second and subsequent runs, it ALWAYS finds abj.jidanni.org! (Which only holds for our pristine case, not all cases... Anyway, the above you should be able to reproduce. iceweasel: 28.0~a2+20131215004001-1)

So we see it probably looks IN its cache before looking FOR its cache, or something! (So how about the fuss I made about "www."? Well, in each case after I had saved preferences I had thus already made the directory tree and thus was not reproducing the above exact test... or something.)

P.S. On the aforementioned second AND subsequent runs, OTHER sites that we have listed in our /etc/hosts still cannot be found when we type them into the URL bar. If we do

    # su - nobody -c "HOME=/tmp/F firefox http://other.site.com/"

to connect to that running firefox, they SOMETIMES can be found... maybe there is a race condition in that case, here on my 2005 vintage computer.
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
Summary: even before ever checking for real server, browser.fixup.alternate.prefix is applied, preventing any contact with the real server, when on localhost → localhost DNS lookup always fails if $HOME empty
Reporter
Comment 8•10 years ago
Reporter
Comment 9•10 years ago
Comment 10•10 years ago
Comment on attachment 8355845 [details]
log.5965.txt - FAILURE on first run
Here's the difference:
2014-01-05 04:35:46.306378 UTC - -1219873024[b722e480]: Resolving host [abj.jidanni.org].
2014-01-05 04:35:46.306398 UTC - -1219873024[b722e480]: No usable address in cache for [abj.jidanni.org]
2014-01-05 04:35:46.306406 UTC - -1219873024[b722e480]: DNS thread counters: total=1 any-live=0 idle=1 pending=1
2014-01-05 04:35:46.306415 UTC - -1219873024[b722e480]: DNS lookup for host [abj.jidanni.org] blocking pending 'getaddrinfo' query: callback [a5651880]
2014-01-05 04:35:46.306458 UTC - -1501562048[a6e58040]: DNS lookup thread - Calling getaddrinfo for host [abj.jidanni.org].
2014-01-05 04:35:46.306722 UTC - -1501562048[a6e58040]: DNS lookup thread - lookup completed for host [abj.jidanni.org]: failure: unknown host.
2014-01-05 04:35:56.114034 UTC - -1220450560[b722e480]: Resolving host [abj.jidanni.org].
2014-01-05 04:35:56.114051 UTC - -1220450560[b722e480]: No usable address in cache for [abj.jidanni.org]
2014-01-05 04:35:56.114103 UTC - -1220450560[b722e480]: DNS thread counters: total=1 any-live=0 idle=0 pending=1
2014-01-05 04:35:56.114114 UTC - -1220450560[b722e480]: DNS lookup for host [abj.jidanni.org] blocking pending 'getaddrinfo' query: callback [a8afd880]
... later ...
2014-01-05 04:35:56.114652 UTC - -1308624064[b722f800]: Resolving host [abj.jidanni.org].
2014-01-05 04:35:56.114661 UTC - -1308624064[b722f800]: Host [abj.jidanni.org] is being resolved. Appending callback [a8afd9d0].
2014-01-05 04:35:56.114670 UTC - -1308624064[b722f800]: advancing to STATE_RESOLVING
2014-01-05 04:35:56.118032 UTC - -1493181632[a8aa5600]: DNS lookup thread - starting execution.
2014-01-05 04:35:56.124224 UTC - -1493181632[a8aa5600]: DNS lookup thread - Calling getaddrinfo for host [abj.jidanni.org].
2014-01-05 04:35:56.124554 UTC - -1493181632[a8aa5600]: DNS lookup thread - lookup completed for host [abj.jidanni.org]: success.
Not really sure what to make of the flaky DNS results.
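To compare the two attached logs without reading them line by line, the getaddrinfo outcomes can be pulled out mechanically. The following is a small Python sketch that assumes the log lines have exactly the shape quoted above; `lookup_results` and the `COMPLETED` pattern are hypothetical helpers written for this thread, not part of any Mozilla tooling.

```python
import re
from typing import Iterable, List, Tuple

# Matches lines such as:
# 2014-01-05 04:35:46.306722 UTC - -1501562048[a6e58040]: DNS lookup thread -
#   lookup completed for host [abj.jidanni.org]: failure: unknown host.
COMPLETED = re.compile(
    r"^(?P<ts>\S+ \S+) UTC - -?\d+\[[0-9a-f]+\]: "
    r"DNS lookup thread - lookup completed for host \[(?P<host>[^\]]+)\]: "
    r"(?P<result>.+?)\.?$"
)


def lookup_results(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (timestamp, host, result) for every completed DNS lookup line."""
    results = []
    for line in lines:
        match = COMPLETED.match(line.strip())
        if match:
            results.append((match["ts"], match["host"], match["result"]))
    return results
```

Calling `lookup_results(open(path))` on each log would give one tuple per completed lookup, which makes it easy to see which run reported "failure: unknown host" and which reported "success".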
Comment 11•10 years ago
Sounds like the system getaddrinfo call is just returning different things in the two cases, no? As in, the problem is in whatever implements getaddrinfo, not in our code...
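One way to test this hypothesis outside Firefox is to call the system resolver directly under the same conditions (same user, same $HOME, network unplugged). The sketch below uses Python's `socket.getaddrinfo`, which wraps the platform's getaddrinfo; the `lookup` helper is a hypothetical name, and whether it fails in the same way as Firefox on the reporter's machine is not known.

```python
import socket
import sys
from typing import List


def lookup(host: str) -> List[str]:
    """Return the distinct addresses the system resolver gives for host.

    An empty list means getaddrinfo reported an error (for example, unknown host).
    """
    try:
        infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for name in sys.argv[1:]:
        addrs = lookup(name)
        print(name, "->", ", ".join(addrs) if addrs else "unknown host")
```

If this script, run as the same user with the same empty $HOME, also fails for abj.jidanni.org on the first attempt, that would point at the resolver setup rather than at Firefox; if it always succeeds, the difference is more likely on the Firefox side.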
Updated•10 years ago
Component: Document Navigation → Networking
Reporter
Comment 12•10 years ago
(In reply to Boris Zbarsky [:bz] from comment #11) Super easy to reproduce: Unplug network, add line to /etc/hosts, run above script.
Reporter
Comment 13•10 years ago
...that is I can totally control if the bug will appear or not: { rm -r /tmp/F; run script; run script again;} repeat as many times as you need. Each first run will have the bug, each second run won't.
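The remove, run, run-again cycle can be scripted so that each run writes its own log. This is a rough Python sketch under several assumptions: `firefox` is on the PATH, each run ends when Firefox is quit by hand, the log files are named per run (the original script used /tmp/log.$$.txt), and the `su - nobody` step is left out. `build_env` and `reproduce` are hypothetical names.

```python
import os
import shutil
import subprocess
from typing import Callable, Dict, List

# Same log modules as the reporter's script in comment 7.
LOG_MODULES = "timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5"


def build_env(home: str, run: int) -> Dict[str, str]:
    """Environment for one Firefox run: fresh HOME plus NSPR logging settings."""
    env = dict(os.environ)
    env["HOME"] = home
    env["NSPR_LOG_MODULES"] = LOG_MODULES
    env["NSPR_LOG_FILE"] = f"/tmp/log.run{run}.txt"  # assumption: one file per run
    return env


def reproduce(home: str, url: str,
              launch: Callable[..., int] = subprocess.call) -> List[str]:
    """Wipe home, then start Firefox twice; return the two log file paths."""
    shutil.rmtree(home, ignore_errors=True)
    os.makedirs(home)
    logs = []
    for run in (1, 2):
        env = build_env(home, run)
        launch(["firefox", url], env=env)  # blocks until Firefox exits
        logs.append(env["NSPR_LOG_FILE"])
    return logs
```

For example, `reproduce("/tmp/F", "http://abj.jidanni.org/")` would correspond to one iteration of the loop described above; per the report, the first log is expected to show the failed lookup and the second the successful one.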
Comment 14•10 years ago
Boris, do you know if this is our bug? Seems like by comment 11 that you think it might not be.
Severity: major → normal
Flags: needinfo?(bzbarsky)
Comment 15•10 years ago
I strongly doubt this is our bug, but someone needs to investigate to make sure...
Flags: needinfo?(bzbarsky)
Comment 16•10 years ago
(In reply to Boris Zbarsky [:bz] from comment #15)
> I strongly doubt this is our bug, but someone needs to investigate to make
> sure...

Need-info to Lukas to find someone to work on this.
Flags: needinfo?(lsblakk)
Comment 17•10 years ago
Jason - you're on the peer list for Core:Networking - is there anything you can see in this case that would help us clarify where the bug lies (in our code or not)?
Flags: needinfo?(lsblakk) → needinfo?(jduell.mcbugs)
Comment 18•10 years ago
Would also be useful to know how far back this reproduces.
Keywords: regressionwindow-wanted
Comment 19•10 years ago
I'm not convinced that we need to be working hard to find a window or an assignee for this; the situation described here is pretty esoteric.
Comment 20•10 years ago
(In reply to Josh Matthews [:jdm] from comment #19)
> I'm not convinced that we need to be working hard to find a window or an
> assignee for this; the situation described here is pretty esoteric.

I agree, it's definitely not something we'd track for release. Up to Jason to prioritize this in his workflow or pass it to someone else better suited to investigate.
Comment 21•10 years ago
Yes, my guess here is that Chrome uses its own DNS resolver, while we use the OS's, and there's something idiosyncratic about the DNS setup on Dan's box (or distro). I don't think I want to put hours into this unless/until we get some evidence that this is a widespread issue. It's very unlikely this is something we can easily fix within Mozilla code (short of shipping our own DNS resolver, which we'll probably do one of these days).
Flags: needinfo?(jduell.mcbugs)
Comment 22•10 years ago
So should I resolve this as WONTFIX until such time that we decide to implement our own DNS resolver?
Keywords: regressionwindow-wanted
Comment 23•10 years ago
(In reply to Anthony Hughes, QA Mentor (:ashughes) from comment #22)
> So should I resolve this as WONTFIX until such time that we decide to
> implement our own DNS resolver?

Don't close it. It could well be a Firefox issue, or maybe not; it's simply not clear. If it were, I would definitely take a patch for it once it was fully investigated. Just because it doesn't rate on the priority list for full-timers (and I agree we've spent too much time talking about it) doesn't mean we wouldn't take the contribution from someone scratching their itch.

It's quite plausible that this is tied up in the online/offline triggering state-resetting work that :bagder and :sworkman are starting this quarter.
Reporter
Comment 24•10 years ago
Did any of you run my test? It only takes half a minute to confirm the bug.
Comment 25•10 years ago
For now, I'll mark this NEW since it's been replicated, even though we aren't sure whether it's a Firefox bug or a problem with the OS's DNS setup.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Updated•8 years ago
Whiteboard: [necko-backlog]
Comment 26•7 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P2 → P1
Comment 27•7 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Updated•2 years ago
Severity: normal → S3