Closed Bug 70871 Opened 24 years ago Closed 9 years ago

file://hostname/dir/dir/filename - hostname not implemented

Product/Component: Core :: Networking: File
Type: defect
Hardware: x86
OS: All
Priority: Not set
Severity: minor
Status: RESOLVED WONTFIX
Target Milestone: Future
Reporter: chimera
Assignee: Unassigned
Depends on: 1 open bug
Keywords: platform-parity, polish
Attachments: 1 file

After a bit of head-scratching, it appears that the server name in the URL

file://server/dir/dir/filename is completely ignored. If an incorrect server is
added, no error messages are displayed. This bug is related to bug 65102 and
60321 and others (query for file://).

An access to the URL file://whatever/hello.world
is interpreted as file:///hello.world
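
To make the two readings concrete, here is a minimal sketch using Python's standard urllib.parse as a stand-in for a generic URL parser. It is not Mozilla's parser, and split_file_url is a hypothetical helper name. It only shows that the first field after "file://" is a host, so dropping it changes which file is named.

```python
from urllib.parse import urlsplit

def split_file_url(url):
    """Return (host, path) of a file: URL as a generic URL parser sees it."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file: URL: " + url)
    return parts.netloc, parts.path

print(split_file_url("file://whatever/hello.world"))  # ('whatever', '/hello.world')
print(split_file_url("file:///hello.world"))          # ('', '/hello.world')
print(split_file_url("file://usr/bin"))               # ('usr', '/bin')
```

Ignoring the host therefore maps file://usr/bin to the local path /bin, which is the surprise described in this report.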

It seems that the URL structure of file:// is often misunderstood, and while
that is a human mistake, Mozilla could make it easier to understand what happens
when a URL gets parsed.

This results in

1) Misunderstanding that results in bugs 65102 and 60321.
2) People might access file://usr/bin but get file:///bin instead.
3) If you add a bookmark URL "file://", you get a list of no-name directories
when viewed using the Bookmark Managers (either in the toolbar, sidebar or
window). This is definitely a minor bug.

Solution: Not sure.
Is the server component ever used with the file:// protocol? If not, then maybe
an error message should be displayed whenever server is not an empty string.
Otherwise, an error message should be displayed whenever an incorrect server is
used.
Over to networking:file
Assignee: asa → dougt
Component: Browser-General → Networking: File
QA Contact: doronr → tever
Target Milestone: --- → Future
Confirmed
Platform: PC
OS: Linux 2.2.17
Mozilla Build: 2001032108

Marking NEW.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: polish
qa to me.

What is meant by "servername"? The first field after "file://" should be the
name of the system, the DNS name (and FQDN) or "localhost" or nothing (the
"file:///" prefix).

File URLs implicitly should not work on any system except the one named by the
FQDN. This is very distinct from most uses of the word server, where people
would think it's an NFS, AFS, AFP or other LAN-based file server named in the URL.

Assuming we are reading the FQDN and ignoring it if it doesn't match the system
the browser is running on, we have a general lack of robust error handling as well.
QA Contact: tever → benc
I'm calling the server anything in the first field after "file://".
You're right, "file://mozilla.org/pub" doesn't make sense, therefore I believe
there should be one of two behaviours:
1) If the first field after file:// is not empty, return an error message.

2) If the first field after file:// is not empty and the servername isn't
localhost, then return an error message. This is the current behaviour of
Netscape 4.x.

But regardless, file://usr/bin should return an error message.
I think that is where we are headed in other bugs that discuss file URL issues...

Is there any aspect of this we need to discuss further?
Nope. Not as far as I can see.
We should not ignore the string between the 2nd and 3rd slash; we should be
doing a weak validation of the file URL.

This is part of RFC 1738:

fileurl        = "file://" [ host | "localhost" ] "/" fpath
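
As an illustration of what a weak validation against this rule could look like, here is a sketch in Python. The regular expression is only a loose approximation of the RFC 1738 host and fpath productions, and weak_validate is a hypothetical name, not anything that exists in Mozilla.

```python
import re

# Loose approximation of:
#   fileurl = "file://" [ host | "localhost" ] "/" fpath
# The host may be empty; when present it is limited to letters, digits,
# dots and hyphens. The fpath part is not checked further.
FILE_URL = re.compile(r"^file://(?P<host>[A-Za-z0-9.-]*)/(?P<fpath>.*)$")

def weak_validate(url):
    """Return (host, fpath) if the URL fits the rule, otherwise None."""
    match = FILE_URL.match(url)
    if match is None:
        return None
    return match.group("host"), match.group("fpath")

print(weak_validate("file:///etc/hosts"))       # ('', 'etc/hosts')
print(weak_validate("file://mozilla.org/pub"))  # ('mozilla.org', 'pub')
print(weak_validate("file://thing"))            # None
```

A check like this only says whether the URL is well formed. Whether the host names the machine the browser runs on is a separate question.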

Okay, to close up my file URL test case re-write...
ideally:
file://mozilla.org/pub - would only work if your machine thinks it IS 
"mozilla.org"

file://thing - a non-empty host with no path after it is invalid, and probably
not worth mapping to "file://thing/" and then showing the local drives if the
hostname matches (as described in the previous examples).

file://usr/bin - only works if you name your machine "usr".

As a final note: files mounted via NFS should still work, but since NFS
mountpoints are transparent, you should just use the full path. Some
consideration has also been given to "nfs:" as a URL scheme.
Summary: file://server/dir/dir/filename - server is ignored → file://hostname/dir/dir/filename - hostname not implemented
Blocks: 102724
*** Bug 65102 has been marked as a duplicate of this bug. ***
Blocks: 109982
OS: Linux → All
+mozilla 1.0 - marking for standards compliance.
Keywords: mozilla1.0
+pp: IE and Chimera had better implementations of file, so this should be looked
at again.
Keywords: mozilla1.0pp
+nsbeta1 - we should be doing some kind of name check.

The exact check (hostname, reverse DNS lookup, etc) I'm not sure of, I'll look
for some relevant bugs.
Wouldn't it make sense to consider, at least on the Windows platform, the part
after // to be a machine name? IE seems to handle file: this way, e.g.:

file://machine/volume/abc.html -> \\machine\volume\abc.html
Hmmm ... there were several issues with file urls mainly because of fixup versus
what the user really meant. That is the general problem in this case. The file
code went through several iterations. Earlier we assumed that the user when
typing something like file://something/somethingelse really meant
file:///something/somethingelse, because file is a local protocol, except when
that "something" was "localhost" or looked like a drive on windows/OS2. That
fixup was done in the urlparser. Later that code was removed and "something"
stayed as host. It is now up to the platform specific implementation of
nsILocalFile to decide what to do with it. I think Unix currently ignores the
host, on windows if it looks like a drive it is pushed into the path. 

Perhaps we can have something on that level (nsLocalFile) that checks if the
hostname is localhost or is the hostname of the local machine (fully qualified
or not). If it is not the case, push the host into the path and then convert the
url into a local path.

In case of badly written UNC *paths* like file://machine/volume/abc.html we
would first change this to file:///machine/volume/abc.html if machine is not
the name of the local machine and then (although that code is currently not
active or already removed), because "machine" does not look like a drive, change
it to file://///machine/volume/abc.html to make it a UNC path which would
result in a local file \\machine\volume\abc.html.

However if I remember correctly that fixup code was deactivated because of
performance issues with WIN95/98(?) with incorrect server names which resulted
in lockups.
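
The fixup steps described in this comment can be sketched roughly as follows. This is an illustration in Python, not the nsLocalFile or urlparser code. The function name, the local_names set, the windows flag, and the drive pattern are all assumptions made for the example.

```python
import re
from urllib.parse import urlsplit

# Assumption: a host "looks like a drive" if it is one letter plus ":" or "|".
DRIVE = re.compile(r"[A-Za-z][:|]")

def fixup_file_url(url, local_names, windows):
    """Rewrite file://host/path along the lines of the proposed fixup."""
    parts = urlsplit(url)
    host = parts.netloc
    if host == "" or host.lower() in local_names:
        # Host is empty, localhost, or this machine: plain local path.
        return "file://" + parts.path
    if windows and not DRIVE.fullmatch(host):
        # Not a drive: turn it into the five-slash UNC form.
        return "file://///" + host + parts.path
    # Otherwise push the host into the path.
    return "file:///" + host + parts.path

names = {"localhost"}
print(fixup_file_url("file://machine/volume/abc.html", names, True))
# file://///machine/volume/abc.html
print(fixup_file_url("file://machine/volume/abc.html", names, False))
# file:///machine/volume/abc.html
print(fixup_file_url("file://c:/index.html", names, True))
# file:///c:/index.html
```

The sketch performs no network access. The lockups mentioned above came from actually resolving incorrect server names, which this string rewrite does not attempt.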
The file://///hostname/dir/dir/file works when the html file (containing the
href) is retrieved via the file protocol, but not when the file is retrieved
using the http protocol.

It always works when the URL is typed in.

IE doesn't have problems.
Because of this I can't use mozilla/netscape on the intranet.

Sjaak
I confirm this behavior described in Comment #15.

The URLs starting with "file://" are working when the file that contains this
href is also retrieved via "file://". The links are not working when the file
that contains the hrefs was retrieved via "http://". The links are also working
when typed into the address bar of Mozilla or Firebird.

In addition to the examples in the attachment (id=132924) appended in
Comment #15 by Sjaak van Schie, the same occurs with mapped drives (URLs like
"file://c:/index.html" or "file://c|/index.html").

I tested this on Windows XP 
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20030925 Firebird/0.7

and on Linux
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 Firebird/0.7

So the summary of this bug is wrong, because "hostname" is implemented and
working (on windows), but "file"-hrefs don't work in files retrieved over http.
found a solution in bug #84128

user_pref("security.checkloaduri", false);

this enables loading "file://..." from http-files.
Depends on: 84128
I had the same problem and found a working solution here.

Here is a simple testcase:
<html>
<head>
<base href="http://www.mozilla.org/"/>
</head>
<body>
<a href="file://///servername/sharename/">Link</a>
</body>
</html>

Change "servername/sharename" to a windows (or Samba) server and share name.
If you click on the link, nothing happens.
Take out the <base> element and (if locally opened) it works.

I can confirm that the solution given by Michael Augustin (Comment #17) solves
this problem.
Andreas: I think that the software should work in the case you have described,
but I am pretty certain that security.checkloaduri is not considering whether
the base comes from the loaded URL or from the <BASE> tag.

I think you should file a new bug for this specific case.
*** Bug 253186 has been marked as a duplicate of this bug. ***
So... when is the bug mentioned in comment #13 and in bug 253186 going to get fixed?

And the checkuri security feature *has* to provide some user feedback, otherwise
it'll just **** people off.  Writing something to the JavaScript console does
NOT count as feedback, since almost nobody has that open all the time.
I'm working on the intranet of a _very_ big company in Europe as a summer job,
and they have lots of links like
  file://hostname/share/path/to/a/file
or even
  file://hostname/share/path\to\a\file 
(this is obtained by copy/paste from the Explorer address bar next to the
file://hostname/share/ part - not very good but I've seen people do this)

All those links are referencing files on the Windows domain shares (accessible
only with the proper permissions) and work well in IE. 

I changed some of them to the 'workaround' format
   file://///hostname/share/path/to/a/file
that works in Mozilla/Firefox once the security setting is disabled, but I
obviously don't have full database access to change them all. Even if I did,
people there would continue to use the other format and "break" them again in
the future.

I urge you to fix this bug before Firefox 1.0 if you want to be able to deploy
Firefox on big Windows-based intranets, since AFAICT, besides some layout
glitches, it is the only thing that would prevent it.
One thing (in my opinion the only thing) that might be possible is to look in
the windows-specific implementation of nsILocalFile if there exists a share with
the name of the hostname and then "fix" the url to the //// version for the file
access.
Would it work on Unix stations in Samba environments, for example? Or does this
problem only occur on Windows, where the mapping between Unix and Windows hosts
is not done?
On unix the file protocol will not handle shares at all as far as I know. It is
not supposed to, the file protocol is a local protocol by definition. 
(In reply to comment #25)
> On unix the file protocol will not handle shares at all as far as I know. It is
> not supposed to, the file protocol is a local protocol by definition. 

Yes, but as far as the OS is concerned, shares are just as much part of the
local filesystem as an actual harddisk -- and I think that's true in both Linux
and Windows.
Yes, that's true because already mounted shares look like a part of the local
filesystem, but in this case, and that is the reason for this bug, the file url
with a hostname (<> localhost) is supposed to invoke the network access and that
is beyond the scope of the file-protocol.
So long as we can all agree that it is in fact a bug in Mozilla/Gecko/Firefox,
and that it gets fixed PDQ (certainly before Firefox 1.0), then I'll be happy :)
No, I don't think it's a mozilla bug. Those are broken file urls and it is a
question if we are nice enough to implement a fixup for those broken urls by
guessing the users intentions. The limit, as always, should be if we break a
perfectly valid url with the fixup in the process.
Given that this should at most be a two-line fix ("ooh, the host component of
this URL is not our own hostname, let's shove an extra three slashes in there
just for fun!"), and given that I've yet to see *any* intranet site that uses
five-slashed URLs (willingly, at least).... why is this taking so long?
(In reply to comment #29)
> No, I don't think it's a mozilla bug. Those are broken file urls and it is a
> question if we are nice enough to implement a fixup for those broken urls by
> guessing the users intentions. The limit, as always, should be if we break a
> perfectly valid url with the fixup in the process.

Sorry, but if I read RFC 1738 correctly, these file URLs are not broken at
all (except for the backslashes in my second example, but that is already taken
care of under Windows anyway).

In the RFC I see that file://<host>/<path> should work, not only
file://///<host>/<path> (see bug #253186 which was marked as duplicate of this
one). 

Or do you mean that Windows hosts are not "real" hosts? I agree Windows should
*probably* do the mapping between Unix hosts and Windows hosts by itself, but
it's not going to do that any time soon. If you go there, remember that the
shell:// protocol was "fixed" in Mozilla a few weeks ago, why would the file://
protocol not deserve the same treatment?
Benoit: SMB-based servers are not the same kind of hosts.

The meaning of the field is to key a file URL to the system it was created on,
not to be a reference to the filesharing system that is serving the file:

RFC 1630:

   There is clearly a danger of confusion that a link made to a local
   file should be followed by someone on a different system, with
   unexpected and possibly harmful results.  Therefore, the convention
   is that even a "file" URL is provided with a host part.  This allows
   a client on another system to know that it cannot access the file
   system, or perhaps to use some other local mechanism to access the
   file.
> perhaps to use some other local mechanism to access the file.

Such as SMB, perchance?
sure, except the host has to be an IP addressable DNS-host.

using smb server names in this area would create potential conflicts with DNS
namespace.

I think that would be unacceptable. I had a windows system at AOL that could not
access www.mozilla.org because somehow IT locked LMHOST mapping on my DNS, and
some idiot registered their windows system as "mozilla".

What you are suggesting would create the same conflict. The primary purpose of
the field is for file URLs so a machine can "sign" a file URL so it knows it
came from itself. There is no further detail on how your interpretation of
"perhaps" could be implemented.

In all likelihood, the author was talking about using nfs to automount, since
nfs was widely available in the original environments of the time. Later, in RFC
1738, nfs and afs are mentioned as "newly registered schemes". 
Comment #13 talks about a solution that caused problems on Windows 9x. Does this
problem still occur on Windows 98 now? Windows 95 is not in the supported
platforms list anymore, even if it works there.

If it does, does that lockup only appear when you try to fix the URL, or even
in the case of a "good" link (five slashes) but with an inaccessible machine? If
it is the latter, I don't see why Mozilla shouldn't try to fix it, since it
would hang even if the URL was "valid".
> I think that would be unacceptable. I had a windows system at AOL 
> that could not access www.mozilla.org because somehow IT locked 
> LMHOST mapping on my DNS, and some idiot registered their windows 
> system as "mozilla".

Such a registration might block you from simply typing "mozilla", hoping to have
it auto-expand to "www.mozilla.org", but there's no way it could possibly block
explicitly entering "mozilla.org".  As soon as you qualify the domain name then
it can't possibly be a LAN address (unless it matches your own domain), so
there's no chance of collision.

ie. a local computer called "mozilla" actually has a FQDN of
"mozilla.yournetwork.com" (for example), although of course it won't be visible
outside your LAN unless your firewall & DNS server are configured to make it
visible.  When a browser encounters a hostname on its own ("mozilla"), it will
first try the LAN, and failing that will then search for it with the most common
DNS suffixes (.com, .org, etc).  This is browser-level functionality -- the DNS
system itself cannot resolve "mozilla", as that's not a valid TLD.  Thus the
only possible way for this to hide mozilla.org is if the user is being lazy.

The simple fact is, I have seen a grand total of 2 links that use the hostname
of file:// to refer to itself in the manner you describe, and several hundred
that use it to refer to an SMB share on the Intranet.  Guess which use seems
more important?  (and I still don't see how either use can preclude the other)
(In reply to comment #0)
> After a bit of head-scratching, it appears that the server name in the URL
> 
> file://server/dir/dir/filename is completely ignored. If an incorrect server is
> added, no error messages are displayed. This bug is related to bug 65102 and
> 60321 and others (query for file://).

Should those two be marked as blocking this bug also then?
Can't we at least do some checking to see if the hostname part is the local
machine, and if it isn't, then return an error?
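
A check of that kind could look roughly like the sketch below. It is written in Python for illustration only. The helper names are hypothetical, and comparing against socket.gethostname() is just one possible test; earlier comments also mention FQDN matching and reverse DNS lookups as alternatives.

```python
import socket
from urllib.parse import urlsplit

def local_host_names():
    """Names treated as 'this machine': empty host, localhost, own hostname."""
    hostname = socket.gethostname().lower()
    # Also accept the unqualified form of a fully qualified hostname.
    return {"", "localhost", hostname, hostname.split(".", 1)[0]}

def local_path_or_error(url, local_names=None):
    """Return the path of a file: URL, or raise if the host is not local."""
    if local_names is None:
        local_names = local_host_names()
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in local_names:
        raise ValueError("file: URL names a non-local host: %r" % host)
    return parts.path
```

With local_names set to {"", "localhost", "mybox"}, file://localhost/etc/hosts yields /etc/hosts, while file://usr/bin raises an error instead of silently opening /bin.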
From: http://www.faqs.org/rfcs/rfc2151.html

7.1. Uniform Resource Locators

   As more and more protocols have become available to identify files,
   archive and server sites, news lists, and other information resources
   on the Internet, it was inevitable that some shorthand would arise to
   make it easier to designate these sources. The common shorthand
   format is called the Uniform Resource Locator. The list below
   provides information on how the URL format should be interpreted for
   the protocols and resources that will be discussed in this document.
   A complete description of the URL format may be found in [4].

  file://host/directory/file-name
       Identifies a specific file. E.g., the file htmlasst in the edu
     directory at host ftp.cs.da would be denoted, using the full URL
     form:  <URL:file://ftp.cs.da/edu/htmlasst>.

So I don't understand where the problem is. File://<hostname> is legal and,
as said more than once in the comments, is in widespread use. Firefox's
inability to use these existing links is hurting it in corporate intranets.
I know our IT dept would love to standardize on it, but small things like this
are a huge deterrent when you have massive amounts of preexisting links that
would need to be changed, not to mention all the code that generates links on
the fly.

This has been around for 4 years. Can we please get this fixed?
(In reply to comment #39)
> So I don't understand where the problem is. File://<hostname> is legal and,
> as said more than once in the comments, is in widespread use. Firefox's
> inability to use these existing links is hurting it in corporate
> intranets. I know our IT dept would love to standardize on it, but small things
> like this are a huge deterrent when you have massive amounts of preexisting
> links that would need to be changed, not to mention all the code that
> generates links on the fly.

There are still some major problems that make this difficult to fix.
1) Nothing in any specification, other than the behaviour of IE, says that
remote file:// URLs should be interpreted as referring to SMB shares. In fact,
the example you quoted suggests they should be interpreted as FTP URLs.
2) If we do decide to treat them as SMB shares then we have to decide what to do
on non-Windows systems where Samba might not be available or might be hard to
detect/use.
3) We have to fix potential security issues when file:// URLs could link to
remote, possibly malicious content.
4) What we do *still* won't be compatible with IE, at the least because IE
allows \ as a path separator --- even examples in this bug show that --- and
that *clearly* violates standards.
(In reply to comment #40)
> There are still some major problems that make this difficult to fix.
> 1) Nothing in any specification, other than the behaviour of IE, says that
> remote file:// URLs should be interpreted as referring to SMB shares. In fact,
> the example you quoted suggests they should be interpreted as FTP URLs.

I agree completely. In the example "file://ftp.cs.da/edu/htmlasst", if ftp.cs.da
is not the localhost, by which protocol should this file be accessed?

ftp? only a guess because of the filename.
smb? because of what information? 

You have to somehow show what protocol to use. file is local, so use the path.
"file://///ftp.cs.da/edu/htmlasst" is the smb-way.
*** Bug 315003 has been marked as a duplicate of this bug. ***
(In reply to comment #4)
> ... "file://mozilla.org/pub" doesn't make sense....

Since when?  

"file://mozilla.org/pub" used to mean to connect using FTP to the machine "mozilla.org" and get its file/directory "pub".

Daniel
(In reply to comment #4)

> But regardless, file://usr/bin should return an error message

Not if "usr" is a valid host name.

(Unless the "file:" scheme has been changed since it meant to connect to the
named host to try to retrieve a file.)
(In reply to comment #43)
> (In reply to comment #4)
> > ... "file://mozilla.org/pub" doesn't make sense....
> 
> Since when?  
> 
> "file://mozilla.org/pub" used to mean to connect using FTP to the machine
> "mozilla.org" and get its file/directory "pub".

That would be ftp://mozilla.org/pub ...
(In reply to comment #45)
> (In reply to comment #43)
> > (In reply to comment #4)
> > > ... "file://mozilla.org/pub" doesn't make sense....
> > 
> > Since when?  
> > 
> > "file://mozilla.org/pub" used to mean to connect using FTP to the machine
> > "mozilla.org" and get its file/directory "pub".
> 
> That would be ftp://mozilla.org/pub ...

Yes, that is another way of specifying 'using FTP to the machine "mozilla.org" 
and get its file/directory "pub".'

However, what's your point?  (We're talking about the meaning of "file:" 
URLs, not "ftp:" URLs.)

Daniel

(In reply to comment #41)

> I agree completely. In the example "file://ftp.cs.da/edu/htmlasst", if ftp.cs.da
> is not the localhost, by which protocol should this file be accessed?
> 
> ftp? only a guess because of the filename.
> smb? because of what information? 

What do the current IETF RFCs say about the "file:" scheme? 
(In reply to comment #46)
> (In reply to comment #45)
> > (In reply to comment #43)
> > > (In reply to comment #4)
> > > > ... "file://mozilla.org/pub" doesn't make sense....
> > > 
> > > Since when?  
> > > 
> > > "file://mozilla.org/pub" used to mean to connect using FTP to the machine
> > > "mozilla.org" and get its file/directory "pub".
> > 
> > That would be ftp://mozilla.org/pub ...
> 
> Yes, that is another way of specifying 'using FTP to the machine "mozilla.org" 
> and get its file/directory "pub".'
> 
> However, what's your point?  (We're talking about the meaning of "file:" 
> URLs, not "ftp:" URLs.)

It seems you want to make a universal network access protocol to files out of the file protocol. However, the file protocol only defines local access, making the hostname more or less meaningless (it can only be localhost or the local hostname). If you want to use ftp, then say so in the URL; if you want to use something else, please do. If you use file, it is a local access. There is only one exception: UNC paths can be used in the path segment(!) of file URLs and designate a file access that may be handled by the OS(!) over a network. Those urls look like file:////server/path or even file:////server/path. As a URI fixup, URLs like file://server/path might be fixed to file://///server/path.
So, fine, force it to the five slashes if that's the only way you'll tolerate it actually working.  But then make sure that the URLs with only two slashes will then autoconvert to the five-slash format.  There are a *lot* of URLs in the wild with only two slashes, and if they don't work then this will be considered a serious Firefox bug.

I don't see the problem with the two-slash/hostname thing, though.  It just means "using local filesystem options, access this file on the other computer", ie. on Windows systems it should access it as a UNC path.  On Linux/Mac systems, either it should just report it as invalid, or ideally try it as a CIFS/NFS mount (especially if it's one that's already listed in the fstab/mtab).
It doesn't make sense for the interpretation of file: URLs to depend on the client platform.

If you want file: remote hosts to be accessed via SMB, petition IETF to make it so.
Of course it does.  file: URLs are for direct filesystem access, which won't be portable across different platforms (or even across computers in the same platform, unless special measures are taken) anyway.  So "that isn't portable!" isn't a valid argument.
Glad to see this deficiency being discussed again. I, and all the users of my corporate intranet, have been dealing with the repercussions for years.

Let's quote this useful passage again:

RFC 1630:

   There is clearly a danger of confusion that a link made to a local
   file should be followed by someone on a different system, with
   unexpected and possibly harmful results.  Therefore, the convention
   is that even a "file" URL is provided with a host part.  This allows
   a client on another system to know that it cannot access the file
   system, or perhaps to use some other local mechanism to access the
   file.

The last phrase is telling. On Windows, the canonical "other local mechanism" is to use a UNC reference. Thus it would be perfectly reasonable, and very useful, to translate file://host/path to \\host\path on Windows, unless host is empty or localhost. By prefixing these names with two backslashes, you're passing the information on to Windows that you "know that it cannot access the (local) file system", and leaving the details of how to interpret the hostname (and the following path) to the OS.
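
To show how small the proposed translation is, here is a sketch in Python. It is an illustration of the idea in this comment, not Firefox code, and file_url_to_windows_path is a hypothetical name. It builds the path string only and leaves all interpretation of the host to Windows.

```python
from urllib.parse import urlsplit, unquote

def file_url_to_windows_path(url):
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file: URL")
    host = parts.netloc
    # Undo percent-encoding, then switch to backslash separators.
    path = unquote(parts.path).replace("/", "\\")
    if host == "" or host.lower() == "localhost":
        # Local reference: drop the single leading separator. A five-slash
        # URL keeps the two backslashes that make it a UNC path.
        return path[1:]
    # Any other host: prefix two backslashes and let the OS resolve it.
    return "\\\\" + host + path

print(file_url_to_windows_path("file://host/share/note.txt"))
# \\host\share\note.txt
print(file_url_to_windows_path("file:///C:/dir/note.txt"))
# C:\dir\note.txt
print(file_url_to_windows_path("file://///host/share/note.txt"))
# \\host\share\note.txt
```

Under this mapping the two-slash form with a host and the five-slash form end up as the same UNC path, so existing intranet links and the workaround links would behave alike.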

SMB doesn't directly enter into it; nowhere will you find it specified that interpreting UNCs necessarily involves SMB. This approach is safe or unsafe according to the local network security configurations. In fact, such a URI interpretation will typically be safer than treating one with a host name that is absent or "localhost" as a local file reference. 

We use Firefox for Windows on our intranet for many reasons, because the good things about it outweigh the bad. In fact, this problem with file URIs is one of very, very few bad things in Firefox. Had this been fixed shortly after this bug was opened, it would have saved me many hours of trying to explain to my users why existing file URIs wouldn't work for them, and then explaining to them (and reminding them again, over and over) how to write a "correct" file URI according to the restrictive definition of "correct" adopted by the Firefox developers. To put this in perspective -- I'm not even a member of any internal network management team. The only internal system that I manage is a small wiki for internal development discussions. I can only managed what fits this problem gives people that actually support large internal corporate system.

Please, please, put this particular issue to bed. Your cautious interpretation of the standards is arguably correct, but the standards are loose on this point, and yours is not the *only* correct interpretation possible.

(In reply to comment #52)

> I can only managed what fits this problem
> gives people that actually support large internal corporate system.

I meant to say "I can only imagine ..." here. Sorry about the typo.

There's a related paragraph over in RFC 1738:
   The file URL scheme is used to designate files accessible on a
   particular host computer. This scheme, unlike most other URL schemes,
   does not designate a resource that is universally accessible over the
   Internet.

In other words, they're not intended to be portable to everyone.  They're intended to be used in very tightly controlled circumstances, which typically means for the local system only or for files on the local intranet.  The latter case is the interesting one, and the one which Firefox does not currently support correctly.

The RFC goes on to state "The file URL scheme is unusual in that it does not specify an Internet protocol or access method for such files", meaning that the user agent is free to use whatever means are at its disposal to talk to the computer with the given hostname and get it to return a file that satisfies the path (as interpreted by that foreign computer).  On Windows, the most reasonable way to do this is as a UNC path.  On other platforms, other mechanisms may make more sense.  The difference between platforms is not a problem.
The notion of RFC 1630 that file://hostname/path refers to the file "path" in the local filesystem of "hostname" does make some sense. A lot more sense than "clients should interpret file: URLs however they want".

I can even imagine having the browser try one or more methods to fetch "path" from "hostname". So given file://hostname/C|/dir/note.txt, it might be reasonable for the browser to fetch \\hostname\C$\dir\note.txt via SMB if possible. I could go along with that.

Interpreting file://hostname/share/note.txt by fetching \\hostname\share\note.txt still doesn't make sense according to that RFC. \share\note.txt is not a path in the local filesystem of "hostname". That, plus the fact that supporting \ as a URI separator is definitely not going to happen, means I doubt we'd get enough IE compatibility to make it useful in practice.
(In reply to comment #55)
> I can even imagine having the browser try one or more methods to fetch "path"
> from "hostname". So given file://hostname/C|/dir/note.txt, it might be
> reasonable for the browser to fetch \\hostname\C$\dir\note.txt via SMB if
> possible. I could go along with that.

Please don't do that. That opens more security issues for little benefit.

I can see the argument for not interpreting the host part differently for file URLs, although it's a weak argument. (More on that later; I'm going to dump the details of my research on this after I address your points.) If you take that line, though, then you ought to do something like refusing to interpret the path if the host does not refer to the current client. In that case I guess you'll have to look the host up as a DNS name and figure out if it refers to the local machine.

> Interpreting file://hostname/share/note.txt by fetching
> \\hostname\share\note.txt still doesn't make sense according to that RFC.
> \share\note.txt is not a path in the local filesystem of "hostname".

I can understand that point of view, but note that the path is not *required* to be a local filesystem path, and the host is not *required* to be a DNS name. More on that later.

> That, plus
> the fact that supporting \ as a URI separator is definitely not going to
> happen,

I don't think anyone is asking for that. I'm not, in any case. In fact, Firefox seems to do too much in this area already. Arguably, a URI like "file:\\pegasus\Data\stuff.c" should work fine. There is no authority field, and the path is "rootless" according to STD 66. The back-slashes should be passed to the OS for interpretation. But this nice, simple syntax doesn't work in my Firefox 1.5.07, presumably because the back-slashes are being reinterpreted as forward-slashes, and thus "pegasus" is treated as a host name.

> .. means I doubt we'd get enough IE compatibility to make it useful in
> practice.

Actually, a wide variety of file: URI forms already work in both browsers, but file references to UNCs in Firefox are inconvenient. They always seem to require a null or localhost host part. Allowing the authority field to be left out, and leaving back-slashes to the OS, as in the example above, would make it much easier for users to write reasonable-looking UNC file: references that work in both browsers.

Lack of support for file://host/share/ is a reasonable interpretation of the standard (which says little about the file URI scheme) and the informational RFC. But it's not the only possible interpretation, and it's hellishly inconvenient for corporate intranets.

Following is, mainly for the sake of convenient reference, the result of my survey of the relevant RFCs.

----

I did a brief survey of the IETF documents related to the file URI
scheme. Basically, my conclusion is that this scheme is not at all
standardized. STD 66 (RFC 3986) mentions it briefly twice.

It is made clear that file URIs are not intended to have global scope:

    URIs have a global scope and are interpreted consistently
    regardless of context, though the result of that interpretation
    may be in relation to the end-user's context.  For example,
    "http://localhost/" has the same interpretation for every user of
    that reference, even though the network interface corresponding to
    "localhost" may be different for each end-user: interpretation is
    independent of access.  However, an action made on the basis of
    that reference will take place in relation to the end-user's
    context, which implies that an action intended to refer to a
    globally unique thing must use a URI that distinguishes that
    resource from all other things.  URIs that identify in relation to
    the end-user's local context should only be used when the context
    itself is a defining aspect of the resource, such as when an
    on-line help manual refers to a file on the end-user's file
    system (e.g., "file:///etc/hosts").

Regarding specifically the host part in file URIs, it says:

    If the URI scheme defines a default for host, then that default
    applies when the host subcomponent is undefined or when the
    registered name is empty (zero length).  For example, the "file"
    URI scheme is defined so that no authority, an empty host, and
    "localhost" all mean the end-user's machine, whereas the "http"
    scheme considers a missing authority or empty host invalid.
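
As a rough illustration of that default (a minimal sketch only, not how any browser implements it), the three "local" forms can be told apart from a named host with Python's standard urllib.parse. The helper name refers_to_local_machine is made up for this example:

```python
from urllib.parse import urlsplit


def refers_to_local_machine(uri: str) -> bool:
    """Return True if a file: URI's host means "the end-user's machine".

    Hypothetical helper: it applies only the STD 66 default quoted above
    (no authority, empty host, or "localhost") and does no DNS lookup.
    """
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError("not a file: URI")
    # hostname is None when the authority is absent or the host is empty;
    # otherwise it is the host, lower-cased.
    host = parts.hostname
    return host is None or host == "localhost"


print(refers_to_local_machine("file:/etc/hosts"))             # no authority
print(refers_to_local_machine("file:///etc/hosts"))           # empty host
print(refers_to_local_machine("file://localhost/etc/hosts"))  # "localhost"
print(refers_to_local_machine("file://respighi/d/tile.png"))  # named host
```

Anything beyond these three forms (for example, deciding whether some other name happens to refer to the current machine) would need a name lookup, which this sketch deliberately leaves out.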

Section 3.2.2 discusses the host part of URIs and points out conventions
for them. It also makes it clear that the host part can be interpreted
with quite a bit of latitude. For instance:

    In other cases, the data within the host component identifies a
    registered name that has nothing to do with an Internet host.

This RFC doesn't define the phrase "registered name", but it does say:

    This specification does not mandate a particular registered name
    lookup technology and therefore does not restrict the syntax of
    reg-name beyond what is necessary for interoperability.  Instead,
    it delegates the issue of registered name syntax conformance to
    the operating system of each application performing URI
    resolution, and that operating system decides what it will allow
    for the purpose of host identification.  A URI resolution
    implementation might use DNS, host tables, yellow pages, NetInfo,
    WINS, or any other system for lookup of registered names.
    However, a globally scoped naming system, such as DNS fully
    qualified domain names, is necessary for URIs intended to have
    global scope.

But as mentioned earlier, file URIs are not intended to have global
scope.

RFC 1630 is informational, and does not specify a standard. The
entirety of its "definition" of the file URI scheme, which has been
selectively quoted earlier in this thread, is as follows:

      The other URI schemes (except nntp) share the property that they
      are equally valid at any geographical place.

      There is however a real practical requirement to be able to
      generate a URL for an object in a machine's local file system.

      The syntax is similar to the ftp syntax, but in this case the
      slash is used to donate [sic] boundaries between directory
      levels of a hierarchical file system is used [sic].  The
      "client" software converts the file URL into a file name in the
      local file name conventions.  This allows local files to be
      treated just as network objects without any necessity to use a
      network server for access.  This may be used for example for
      defining a user's "home" document in WWW.

      There is clearly a danger of confusion that a link made to a
      local file should be followed by someone on a different system,
      with unexpected and possibly harmful results.  Therefore, the
      convention is that even a "file" URL is provided with a host
      part.  This allows a client on another system to know that it
      cannot access the file system, or perhaps to use some other
      local mecahnism [sic] to access the file.

      The special value "localhost" is used in the host field to
      indicate that the filename should really be used on whatever
      host one is.  This for example allows links to be made to files
      which are distribted [sic] on many machines, or to "your unix
      local password file" subject of course to consistency across the
      users of the data.

      A void host field is equivalent to "localhost".

Interesting to note two things here: There are a lot of typos for such
a short bit of text, and there are very few imperatives in this
description. They are:

* The forward-slash is used to denote boundaries between directory
  levels. This implies that forward-slashes will be translated to the
  client's native syntax for directory hierarchies. Older RFCs obsoleted by
  this one even give examples of a translation to VMS' path name
  syntax.

* A void host field is equivalent to "localhost"

* "localhost" "is used ... to indicate that the filename should really
  be used ...". The "should really be" weakens even this imperative.

Nowhere does it say that the path *must* refer to a local file
system. The wording suggests that this is an intended use of the file
scheme, but also makes it clear that broader use to access files on
remote systems is a possibility.

So there is almost nothing that is standardized about this scheme, and
very little additional information provided by IETF documents.

*** Bug 358887 has been marked as a duplicate of this bug. ***
(In reply to comment #57)
> *** Bug 358887 has been marked as a duplicate of this bug. ***

The extra information, not previously noted here, from that bug is that Microsoft provide a function

   UrlCreateFromPath

which converts a file pathname into an equivalent URL. If you pass this a UNC
path, e.g.,

   \\respighi\d\tile.png

it returns a URL like

   file://respighi/d/tile.png

I.e. the use of this syntax is not just an accidental -- if common -- mistyping, but something which Microsoft apparently intend to be the canonical method of referring to a UNC path from a file: URL.
FYI: http://blogs.msdn.com/ie/archive/2006/12/06/file-uris-in-windows.aspx

This post on the IE blog confirms this to be the "proper" conversion under Windows. They have the following example:

For the UNC Windows file path
     \\laptop\My Documents\FileSchemeURIs.doc

The corresponding valid file URI in Windows is the following:
     file://laptop/My%20Documents/FileSchemeURIs.doc 
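
The mapping described in the last two comments can be sketched in a few lines of Python. This is only an illustration of the UNC-to-URL convention shown in those examples, not Microsoft's implementation of UrlCreateFromPath. The function name unc_to_file_uri is hypothetical:

```python
from urllib.parse import quote


def unc_to_file_uri(unc_path: str) -> str:
    """Map \\\\host\\share\\path to file://host/share/path.

    Hypothetical helper: the UNC server name becomes the URI host and the
    remaining components become percent-encoded path segments.
    """
    if not unc_path.startswith("\\\\"):
        raise ValueError("not a UNC path")
    host, _, rest = unc_path[2:].partition("\\")
    if not host or not rest:
        raise ValueError("a UNC path needs a host and a share")
    # Encode each segment separately so a space becomes %20.
    segments = [quote(segment, safe="") for segment in rest.split("\\")]
    return "file://" + host + "/" + "/".join(segments)


print(unc_to_file_uri(r"\\respighi\d\tile.png"))
print(unc_to_file_uri(r"\\laptop\My Documents\FileSchemeURIs.doc"))
```

For the two UNC paths quoted above, this yields file://respighi/d/tile.png and file://laptop/My%20Documents/FileSchemeURIs.doc respectively.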
(In reply to comment #59)

However, this also works in IE 6.0:

file://localhost///laptop/My%20Documents/FileSchemeURIs.doc 
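
For comparison, here is how a generic URI splitter divides the two spellings into host and path. This is a sketch using Python's urllib.parse and says nothing about what either browser does internally. The helper name host_and_path is made up for the example:

```python
from urllib.parse import unquote, urlsplit


def host_and_path(uri: str):
    """Return (host, decoded path) for a URI; hypothetical helper."""
    parts = urlsplit(uri)
    return parts.hostname, unquote(parts.path)


# Host form: the UNC server name sits in the authority.
print(host_and_path("file://laptop/My%20Documents/FileSchemeURIs.doc"))

# localhost form: the authority is "localhost" and "//laptop" is
# carried at the front of the path.
print(host_and_path("file://localhost///laptop/My%20Documents/FileSchemeURIs.doc"))
```

In the second form the splitter reports "localhost" as the host and leaves ///laptop/My Documents/FileSchemeURIs.doc as the path.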

So Microsoft implicitly acknowledges the validity of the Mozilla approach (a strict reading of RFC 1738, section 3.10), although they deny it in this blog post, where they attempt to pwn Gervase Markham:
http://blogs.msdn.com/ie/archive/2006/09/13/752347.aspx

"//laptop" is part of the <path> and not really the <host>, in my opinion.
 
mass reassigning to nobody.
Assignee: dougt → nobody
So what happens with this bug? It's 3.0 now and it's still here.
Why oh why on windows systems can't we just parse the stuff as in IE?

I'm working in a mixed browser environment here, and if I change all the URLs to match the strict Mozilla way of doing it, they won't work in IE.
Apparently you've decided that this is WONT FIX?

https://bugzilla.mozilla.org/show_bug.cgi?id=88293

Great, then we'll know not to use Mozilla.
See Also: → 685853
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX