Closed Bug 34943 Opened 20 years ago Closed 18 years ago

Domain guessing (automatic www. and .com) shouldn't happen on links/URLs

Categories

(Core :: Document Navigation, defect, P3)

Tracking

()

VERIFIED FIXED
mozilla1.0

People

(Reporter: jruderman, Assigned: adamlock)

Details

(Keywords: topembed+, Whiteboard: [ADT2 RTM] security)

Attachments

(4 files, 1 obsolete file)

Mozilla should only add the automatic "www." and ".com" to URLs when the user
types the URL in, not when the user clicks a link such as http://mozilla/ .

This is a potential security issue: suppose you have a computer named chameleon
on your network and log into it through the web using http://chameleon/* .
Someone manages to DoS chameleon and also gets ahold of the chameleon.com domain
name.  Next time you try to log into chameleon using your local copy of the
login form, they get your password.  Then they go to
http://chameleon.network.net/ and log in using your password (unless chameleon
is selective in which IPs it allows to log in).
Problem confirmed with 2000-04-06-10-M15 on WinNT.

Adding junruh@netscape.com to Cc: list -- this does look like a security
issue to me as well, and one that would not necessarily need so complicated
a scenario to be problematic.

While nobody else could reproduce the problem, bug 17657 reports a problem
where only the "www." ".com" version was ever visited.
good catch. ->travis
Assignee: gagan → travis
Should be fixed now...
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
2000 042112, and still happens.  Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This works correctly with 20000524 win32/linux/mac builds. Clicking on a 
link (e.g., http://blues/ ) takes the browser to the local domain host 'blues'.
Putting back fixed.
Status: REOPENED → RESOLVED
Closed: 20 years ago → 20 years ago
Resolution: --- → FIXED
reopening (build 2000 060820 win98)

http://mozilla/ takes me to http://www.mozilla.com/ .
http://blues/ takes me to http://www.blues.com/, which redirects to another 
page.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
-> jud
Assignee: travis → valeski
Status: REOPENED → NEW
Target Milestone: --- → M19
Similarly, a link to http://cmc.edu/ should fail but the same URL typed into 
the location bar should work.
And another one: file:///foo/bar/ -> file://///foo/bar/ , which is making bug 
38643 (a win9x crasher) happen more often than it would otherwise.
http://www.mozilla.org:/ (from bug 31854) probably shouldn't work as a link.
Adding a dependency on bug 60768 which deals with moving the fixup code into a 
separate service.
Depends on: 60768
Target Milestone: M19 → mozilla1.0
Blocks: 63736
Ccing mstoltz and nominating for nsbeta1.

Bug 65421 might be a dup.
Keywords: nsbeta1
Looking...
Status: NEW → ASSIGNED
Jesse's attack scenario is a bit implausible - an attacker would need to a) know
the name of a machine on your intranet that might contain interesting
information, and b) own the equivalent domain name in .com. Can anyone come up
with a more plausible scenario?

Maybe we should have a pref for "assume www.*.com" or "never assume." 
Ok, maybe it's not that big of a security hole.  It's still a bug though.

I don't think a pref is needed: this type of URL correction should never happen 
on links (or form submission, or going to a bookmark), and if you type the URL 
into the location bar you're likely to notice when it changes.
The W3C thinks there *should* be a pref for turning off www. / .com guessing 
for URLs typed into the location bar (I said in my previous comment that I 
didn't think a pref would be necessary).

http://www.w3.org/TR/2001/NOTE-cuap-20010206#cp-keyword
1.6 Allow the user to override any mechanism for guessing URIs or keywords. 
Many user agents compensate for incomplete URIs by applying a series of 
transformations with the hope of creating a URI that works. For example, many 
user agents transform the string www.w3.org into the URI http://www.w3.org/. 
The user should be able to control whether, for example, typing a keyword 
should invoke a Web search or whether the user agent should prepend http://www. 
and append .org/.
The W3C CUAP's suggestion of having a pref to always turn url-fixing off is now 
bug 68407.
bug 65421 should be added to the list of bugs that this blocks.

I saw this when I clicked a link to "http://primates/~vladimir" on slashdot.
Someone from ximian had erroneously posted an internal URL. When I clicked the
link, I got to a netscape search on "primates." I thought this was someone's
idea of a practical joke, but when I looked back at the link, I found it was
this bug...

FYI I have internet keywords turned on.
A link like http:/cgi-bin/man2html?locate+1l (note the single '/' after 'http:')
should be interpreted as relative to the hostname in the URL of the page.

Example: I'm reading
  http://localhost/cgi-bin/man2html/usr/share/man/man1/xargs.1.gz
and I click on the link above; Navigator4.76, lynx and w3m load
  http://localhost/cgi-bin/man2html?locate+1l,
Mozilla (Build ID: 2001031005; Linux) loads
  http://www.cgi-bin.com/man2html?locate+1l
if internet keywords are disabled, and
  http://search.netscape.com/cgi-bin/search?charset=UTF-8&search=cgi-bin
if internet keywords are enabled.


With this bug, man2html is unusable with Mozilla, because it generates only
relative links!
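
For readers following the man2html example, here is a minimal Python sketch of the lenient interpretation the reporter describes, in which a same-scheme reference with no host ("http:/path") inherits the host of the current page. The function name is hypothetical, and this is not Mozilla's code; it only reproduces the result that the comment attributes to Navigator 4.76, lynx and w3m.

```python
from urllib.parse import urlsplit, urlunsplit


def resolve_legacy_reference(base_url: str, reference: str) -> str:
    """Resolve a host-less, same-scheme reference against the page URL.

    Hypothetical helper: handles only references such as
    'http:/cgi-bin/x?y' (scheme, single slash, absolute path).
    """
    base = urlsplit(base_url)
    ref = urlsplit(reference)
    # Anything with its own host, another scheme, or a non-absolute
    # path is outside this sketch and is returned unchanged.
    if ref.netloc or ref.scheme != base.scheme or not ref.path.startswith("/"):
        return reference
    # Keep the page's scheme and host; take path and query from the link.
    return urlunsplit((base.scheme, base.netloc, ref.path, ref.query, ref.fragment))


page = "http://localhost/cgi-bin/man2html/usr/share/man/man1/xargs.1.gz"
print(resolve_legacy_reference(page, "http:/cgi-bin/man2html?locate+1l"))
```

With domain guessing applied instead, "cgi-bin" is treated as a hostname, which is how the www.cgi-bin.com result above arises.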
This kind of relative URL is deprecated by RFC 2396, and we decided to no
longer support it. 
Marcello, Andreas, isn't that what bug 65421 is exactly about?
Open Networking bugs, qa=tever -> qa to me.
QA Contact: tever → benc
re: Michael

It's a security hole for two reasons:

1- Even if a domain is not actively using a web server @ www.<hostname>.com, a
wildcard DNS record could be used to send all web requests somewhere.

2- Anytime this goes out some place to the wrong machine, it could carry
information in the URL to a remote site. (This is the same reason I object to
the internet keywords feature being hooked up as a secondary handler between
hostname resolution and posting DNS errors...)
Good points, Ben. Necko-level security needs a serious look in general.

If your last comment was directed at me, the name is Mitchell, not Michael.
sorry. too many bugs, not enough sleep...
OS: Windows 98 → All
Hardware: PC → All
As of build 2001061404 win32 installer sea trunk
this seems to be fixed except that Mozilla doesn't give any error messages for
invalid links.
Since DNS cache is not going to support negative entries, fixing this would do a
lot for overall DNS performance in real-world situations.

There is also an RFE for even more liberalized hostname -> Top Level Domain
searching (bug 37867), and I think that would worsen the problem further, so I'm
blocking that bug with this bug.
Blocks: 37867
> Jesse's attack scenario is a bit implausible - an attacker would need to a)
> know the name of a machine on your intranet that might contain interesting
> information, and b) own the equivalent domain name in .com. Can anyone come up
> with a more plausible scenario?

Given that many "words" under .com are already registered and hooked up, it is
not unlikely that there will be a corresponding <http://www.jazz.com> for your
internal <http://jazz/>. So, if you try to log into an internal server and it
happens to be down, your login information might leak into the public internet
to a "random" site. Such a case could appear accidentally, without any "attacker",
nevertheless confidential information has been revealed.

Note: I am behind a Squid proxy and I don't see this bug with 0.9.1.
resetting owner to default
Assignee: valeski → neeti
Status: ASSIGNED → NEW
Blocks: 104166
Why does Mozilla try to add "www."? Why does it need to? Wouldn't the easiest
way to fix this simply be to disable the www-adding?

Personally I find the automatic www-thing very annoying. If a machine in a local
network called 'whatever' is down, why would I want Mozilla to try
www.whatever.com? It's a waste of bandwidth, and almost all important sites work
without www. anyway.
*** Bug 115539 has been marked as a duplicate of this bug. ***
Will fixing bug 115539 also fix this?
Probably.
Marking dependent on bug 115539.
Depends on: 115539
auto adding www. or .com is plain weird. Btw we're not all in a .com world. What
about .org, .net, .dk, .info... etc
related to 115539
Assignee: neeti → adamlock
Component: Networking → Embedding: Docshell
QA Contact: benc → adamlock
*** Bug 115539 has been marked as a duplicate of this bug. ***
The www. & .com appending should be a pref but I don't see it as a security issue. 

The code in nsDefaultURIFixup::MakeAlternateURI doesn't do anything if the URL
contains user or password information.
Uh, not a security issue?

debian.org => www.debian.org.com -- ok it's just Debian but it could be
something you don't want people to know

internalserver => www.internalserver.com -- this one should never leave the
internal network, but instead it goes straight to the external DNS servers. With
a bit of luck (hah), internalserver.com is owned by someone you don't like. And
now they know a URL within your internal network.

I'm sure more creative people could think of even better examples. Look up the
jargon file entry for DWIM to see why guessing and acting on behalf of the user
based on that guessing is Evil and Rude.
Security: also consider the typo www.creditcardcom (missing dot)
Mozilla will expand this to www.creditcardcom.com.  Now imagine
if this exists and is a lookalike for the creditcard site.
You could easily give it personal information before you
noticed. Badness.

Lest this seems far fetched I know for a fact that at least one
famous site has a framing and linking copycat site which does exactly
this.   See my comments for bug 125871.

To me, this means that mozilla should NEVER mangle a url, whether
typed in or from a clicked link.  Cesar's comments about DWIM are
spot on.
I do see this as a security issue (spoofing), and we can argue as to the
severity, and risk vs. the inconvenience of removing this feature that people
have come to expect. The URL bar is generally permissive of a wide range of
input errors. Any thoughts on the security/functionality tradeoff here?
The only thing I've come to expect is the automatic prepending of http:// .
That's a huge time saver. I believe lots of people here agree with me.

Adding www. and .com only sometimes (depending on the particular relative
timings of the DNS servers, the resolver timeout, and the phase of the moon)
violates the principle of least surprise. Like when I (on lynx) tried to load
"testing.debian.org" and got a default Debian Apache page -- only to find out
after chatting with the Debian developers that it had misautoguessed something
like "www.testing.debian.org.org". Which some kind soul set up to point to
127.0.0.1. I was seeing a page on my own box -- something I would not expect.

Of course, now I have disabled it on lynx.cfg...
I'm convinced. There have been several dupes showing porn sites which take
advantage of this. I'm for fixing this, but I will try to get some more opinions
around here.
To clarify the current behaviour:

if there are no dots in the name (e.g. foo)
  return www.foo.com
else if there is one dot in the name
  if it starts with www. (e.g. www.foo)
    return www.foo.com
  else (e.g. foo.org)
    return www.foo.org
else
  do nothing

So debian.org should become www.debian.org (assuming debian.org doesn't exist)
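
The rule summarised in this comment can be written as a small runnable function. This is a Python sketch of the pseudocode above only, not the actual nsDefaultURIFixup code (which is C++); the function name is hypothetical, and the real code applies further conditions mentioned elsewhere in this bug (http only, no user or password information).

```python
from typing import Optional


def guess_alternate_host(host: str) -> Optional[str]:
    """Return the alternate hostname to try, or None for 'do nothing'.

    Hypothetical helper mirroring the pseudocode in the comment above.
    """
    dots = host.count(".")
    if dots == 0:
        # No dots, e.g. "foo": add both "www." and ".com".
        return f"www.{host}.com"
    if dots == 1:
        if host.startswith("www."):
            # e.g. "www.foo": only the ".com" suffix is missing.
            return f"{host}.com"
        # e.g. "foo.org": only the "www." prefix is missing.
        return f"www.{host}"
    # Two or more dots: leave the name alone.
    return None


for name in ("foo", "www.foo", "foo.org", "debian.org", "www.debian.org"):
    print(name, "->", guess_alternate_host(name))
```

Under this rule the typo www.creditcardcom from the earlier comment has one dot and starts with "www.", so it maps to www.creditcardcom.com, which matches the behaviour described there.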

I can add a pref e.g. "fixup.alternate" plus prefs for the strings to stick on
the front and the back.
Just get rid of it.
Attached patch: Patch
Patch makes fixup configurable via prefs. The UI for this & default prefs can
be determined in another patch.
If bug 115539 is going to remain a duplicate of this one, then this bug's
summary should be updated to include typed URLs, in addition to links.
Otherwise, 115539 should remain a separate bug.
Attachment #70713 - Flags: review+
Comment on attachment 70713
Patch

r=brade
Keywords: topembed
+nsbeta1 - I'm dogfooding on this bug now.

#47 - I've separated the two bugs by reopening the dupe...

I've modified the summary to make more sense, and use the term "domain guessing"
for this behavior that seems to have no name...

Location is user input, which is a legitimate area for some type of shortcut and
domain guessing behavior (w/ a dozen or so good bugs filed against it).

We don't do this for https: apparently (a comment from the patch's code snippet says:)

 // Code only works for http. Not for any other protocol including https!

Domain guessing in links is really bad because you don't know what the author's
intent was. I just spent all night sending requests to www.<SERVER>.com, because
I clicked on links that had hostnames. This revealed a bunch of URI's to whoever
is running those systems. The authors will go unnamed, but they should be very
glad they didn't have URL's like "http://server/project/januaryshipdates.html".

I think it is important to understand the level of confusion involved here.
I understand that this person understands their system is named "system", so
they make links like "http://system/" and think their work is done.

However, this person has already ignored several messages from me to literalize
just what he means to point to with his links. I think this is reasonable proof
that another netscape employee should fix this problem for that person's sake.

Also, you get situations like an email message that actually had a link that
said: "my server is HERE" (with http://server.mcom.com as the HREF), but then
also puts at the bottom of the message "http://server.mcom.com" (with the HREF
of http://server/) !!!

Fixing this would easily minimize the security leaks some publishers inflict
upon themselves, as well as save users the frustration of staring at the
throbber as DNS tries to find something that matches one of the incorrect guesses.

4xp does this, but I think we should override the legacy behavior for better
security.

Expanding localhost is another example of where we have problems.
--- 

The only form of hostname -> expanded domain I think is reasonable is bug 88217,
because that actually reflects a user's attempt to make these things go away.

Also, the debian examples might be partially explained by bug 40082, which
depends on bug 124565.

Keywords: nsbeta1
Summary: automatic www. and .com shouldn't happen on links → Domain guessing (automatic www. and .com) shouldn't happen on links/URLs
Adam, I didn't fully scan through benc's comments but if I'm reading it
correctly it sounds like he has some concerns with the behavior with this patch.
Is that right, if so can you address those? If not, email me again for a super
review and I'll take a look at the patch.  
Keywords: nsbeta1, topembed → nsbeta1+
Whiteboard: ADT2
Whiteboard: ADT2 → [ADT2]
Uhm, what's holding back the patch?
I'll resubmit the patch for sr. I know it doesn't address all the issues raised
by this bug so I won't close the bug. But the patch will allow people to disable
the feature entirely if they want for time being.
Comment on attachment 70713
Patch

Do we already have default pref values for: 
"browser.fixup.alternate.enabled"
and
"browser.fixup.alternate.prefix"
and
"browser.fixup.alternate.suffix"

If not, can we add defaults?

everything else looks good. sr=mscott
Attachment #70713 - Flags: superreview+
Attached patch: Pref diff
Changes to all.js to add defaults for these prefs.
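
As a rough illustration of how the three prefs named in the review could gate the guess, here is a minimal Python sketch. The dictionary, the function name, and the "www." and ".com" values for prefix and suffix are assumptions made for illustration (inferred from the behaviour described in this bug); the thread itself only quotes the enabled pref as defaulting to true, and the real logic lives in the C++ fixup code.

```python
from typing import Dict, Optional, Union

PrefValue = Union[bool, str]

# Assumed defaults for illustration; only "enabled" = true is quoted in the thread.
DEFAULT_PREFS: Dict[str, PrefValue] = {
    "browser.fixup.alternate.enabled": True,
    "browser.fixup.alternate.prefix": "www.",
    "browser.fixup.alternate.suffix": ".com",
}


def alternate_host(host: str, prefs: Dict[str, PrefValue]) -> Optional[str]:
    """Hypothetical helper: build the guessed host for a dotless name."""
    if not prefs.get("browser.fixup.alternate.enabled", False):
        # Guessing switched off: never rewrite the host.
        return None
    if "." in host:
        # This sketch covers only the simplest (dotless) case.
        return None
    prefix = str(prefs.get("browser.fixup.alternate.prefix", ""))
    suffix = str(prefs.get("browser.fixup.alternate.suffix", ""))
    return f"{prefix}{host}{suffix}"


print(alternate_host("blues", DEFAULT_PREFS))
disabled = dict(DEFAULT_PREFS, **{"browser.fixup.alternate.enabled": False})
print(alternate_host("blues", disabled))
```

Setting the enabled pref to false turns the guess off entirely, which is the escape hatch the first patch is meant to provide.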
Comment on attachment 70713
Patch

a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #70713 - Flags: approval+
> +pref("browser.fixup.alternate.enabled", true);

Shouldn't this be

  +pref("browser.fixup.alternate.enabled", false);

instead?
First patch is checked in. The prefs mean the behaviour is the same as before
but at least you can disable it now.

Next patch will address how best to do fixup such that URIs in the address bar
are fixed up if loading fails but link clicks are not.
> The prefs mean the behaviour is the same as before
> but at least you can disable it now.

That's what the prefs mean, yes, but why do you want it to be enabled by
default? I strongly believe this should default to disabled, at the very least
until bug 66183 is fixed.
I also strongly agree that the default should be off.
Read the above comments especially those mentioning security concerns.

It's also hard to see where this "feature" would help anyone these days,
with the proliferation of non .com urls.  It's more likely to baffle
than help, especially relative internet newbies.
Someone file a new bug and provide the relevant one-line patch to have it turned
off by default. Then, we can have the debate.

Gerv
Oops... this is still open. Still, I think a new bug might be better. One bug,
one issue.

Gerv
I've reopened bug 115539 to debate whether browser.fixup.alternate.enabled
should be true or false by default.
Fix turns out to be straightforward - only do fixup for LOAD_NORMAL loadtypes.
Don't do fixup for any other type, e.g. LOAD_LINK.

Reviews please.
I did a quick LXR search
<http://lxr.mozilla.org/seamonkey/search?string=LOAD_NORMAL> and it seems that
LOAD_NORMAL is used in other cases than the urlbar, i.e. for image loading. It
should only *ever* happen in the urlbar, everything else is a bug (possibly
security-relevant).
The LOAD_NORMAL I'm testing for is a docshell flag. Are you sure you're not
confusing it with nsIRequest::LOAD_NORMAL?
No, I'm not sure.
LOAD_NORMAL is a combination of flags for the docshell and has nothing to do
with the LOAD_NORMAL flag on the nsIRequest interface.

By adding this test I am able to restrict URI fixup to normal docshell load on a
document which means fixup will not happen on pages loaded from cache, from
history, from a link click and various other combinations. Inline content such
as images, css, iframes etc should not be fixed up either.

I will attach a testcase to demonstrate.
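
The load-type check described in this comment might be sketched as follows. This is a Python illustration only: the enum members and the function name are hypothetical stand-ins for the docshell load types, and the subframe parameter is an assumption based on the later remark that the fix also suppresses fixup for frames and iframes.

```python
from enum import Enum, auto


class LoadType(Enum):
    """Hypothetical stand-ins for docshell load types."""
    NORMAL = auto()   # e.g. a URL entered in the location bar
    LINK = auto()     # a clicked link
    HISTORY = auto()  # back/forward navigation
    RELOAD = auto()   # reload of the current page


def should_attempt_fixup(load_type: LoadType, is_subframe: bool = False) -> bool:
    """Allow domain guessing only for a normal top-level document load."""
    return load_type is LoadType.NORMAL and not is_subframe


for lt in LoadType:
    print(lt.name, should_attempt_fixup(lt))
```

Whitelisting a single load type, rather than blacklisting link clicks, means any load type not explicitly considered gets no fixup by default.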
Attached file Test case
Some HTML with some duff links that shouldn't be fixed once the patch is
applied.
Updated patch handles frames & iframes too
Attachment #76004 - Attachment is obsolete: true
Blocks: 34934
No longer blocks: 34934
qa to me.
QA Contact: adamlock → benc
r=radha (after conferring with Adam)
*** Bug 133056 has been marked as a duplicate of this bug. ***
Comment on attachment 76059
Updated patch to disable fixup on link click

sr=rpotts@netscape.com
Attachment #76059 - Flags: superreview+
Comment on attachment 76059
Updated patch to disable fixup on link click

a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #76059 - Flags: approval+
Requesting adt1.0.0+.

Risk summary - low. Patch is self contained and reasonably obvious. It
suppresses www. & .com fixup for link clicks & subframes. 
Keywords: adt1.0.0
Keywords: topembed+
adt1.0.0+ (on ADT's behalf) for checkin into 1.0.
Keywords: adt1.0.0 → adt1.0.0+
Fix is checked in.
Status: NEW → RESOLVED
Closed: 20 years ago → 18 years ago
Resolution: --- → FIXED
The ideal test for this would check every URL entry point, the full list of which
I do not have in my head.

Does anyone have the HTML expertise to tell me if there are additional tags
besides what is covered in the testcase?

TIA.
benc: for a list of URL entry points (?) that often misbehave, skim bug 55237
and bug 69070 (at least bug 55237 comment 0 and bug 69070 comment 24).  I'd also
test frames, iframes, "view image" (context menu), "save link target as", and
"open link in new window".  If you're bored, test all of those for referrer and
checkloaduri as well :)
VERIFIED:
Linux 2002-04-09-08
MacOSX 2002-04-10-08
Win32 200-04-09-03

jruderman: yikes. that's a lot of stuff. I took QA of the Domain guessing bugs
because they are related to DNS.

I'm not going to (at this time) hunt around all the places people call Necko for
URLs and figure this out. I should note that Open Tab and Open New Window seem
to have inconsistent DNS error handling/Domain Guessing behaviors as well.
Status: RESOLVED → VERIFIED
Was this fixed on the 1.0 branch? If yes, then pls mark this bug as fixed1.0.0,
so QA can verify the fix.
Blocks: 143047
Whiteboard: [ADT2] → [ADT2 RTM]
This looks like it was fixed pre-branch, based on my comments. However, at least
one person has reported a regression: bug 147119.
*** Bug 65421 has been marked as a duplicate of this bug. ***
As in comment #5
Try this:
click on this link http://blues/ then middle-click on the link again.
If you middle-click, the host blues is tried, and then www.blues.com is tried.
I don't know if the middle-click issue belongs in this bug too.
It seems that the middle-click ends up here:
http://lxr.mozilla.org/mozilla1.0/source/docshell/base/nsDefaultURIFixup.cpp#208

I have set up the middle-click to open the link in a new tab in the background.

Used: Mozilla 1.0.1 RC1 on WinXP
Oliver: see bug 159742, "URI fixup happens after 'open link in new window' as if
the address was typed by hand".
Thanks Jesse,
I'm going to bug #159742 since this bug is closed.
*** Bug 31335 has been marked as a duplicate of this bug. ***
Whiteboard: [ADT2 RTM] → [ADT2 RTM] security
Why is the status of this bug "VERIFIED FIXED"? The issue is alive and Bug 159742 is still "NEW".