Open Bug 121059 Opened 21 years ago Updated 3 years ago

[RFE] File inputs should allow URLs


(Core :: DOM: Core & HTML, enhancement, P5)






(Reporter: john, Unassigned)


(Blocks 1 open bug)



(1 file)

Many times (for example, with HTML syntax checkers) you want to upload a file
that is somewhere on the web.  Applications nowadays handle this generally with
an extra "URL" field, and then the server side goes off and fetches the file itself.

There is no reason a file input field could not include http: and ftp: URLs in
addition to files.  This would, in many cases (though not all by any means),
eliminate the need for duplicate code on these servers and allow the user more
flexibility in sending files.  I suggest we leave the input field the way it is
now, detect these types of URLs in the input field, and then fetch and submit
the data.
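
A minimal sketch of the detection step proposed above, assuming plain std::string handling rather than any Mozilla API. The helper names ExtractScheme and IsRemoteUploadSource are hypothetical, and limiting the check to http: and ftp: simply mirrors the two schemes named in this report.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns the lower-cased scheme of `value` when it has
// the shape "scheme:...", or an empty string when it does not.
std::string ExtractScheme(const std::string& value) {
  const std::string::size_type colon = value.find(':');
  if (colon == std::string::npos || colon == 0) {
    return "";
  }
  // A URI scheme starts with a letter.
  if (!std::isalpha(static_cast<unsigned char>(value[0]))) {
    return "";
  }
  std::string scheme;
  for (std::string::size_type i = 0; i < colon; ++i) {
    const unsigned char c = static_cast<unsigned char>(value[i]);
    // Remaining scheme characters: letters, digits, '+', '-', '.'.
    if (!(std::isalnum(c) || c == '+' || c == '-' || c == '.')) {
      return "";
    }
    scheme.push_back(static_cast<char>(std::tolower(c)));
  }
  return scheme;
}

// Hypothetical policy: only http: and ftp: values are fetched before upload;
// everything else keeps being treated as a local file name.
bool IsRemoteUploadSource(const std::string& value) {
  const std::string scheme = ExtractScheme(value);
  return scheme == "http" || scheme == "ftp";
}
```

With this sketch a Windows path such as C:\docs\page.html yields the scheme "c" and is still treated as a local file, because only the two listed schemes count as remote.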
Summary: File inputs should allow URLs → [RFE] File inputs should allow URLs
Solving this seems like it would make bug 114106 easier.
My modest opinion:

This could be used by malicious users to overload another server. A "script
kiddie" will be able to hide behind a public proxy and instruct the server that
performs the fetching to attack another one (using javascript for automation for
While right now there are not many servers around that support fetching from a
third-party server, this feature will encourage their proliferation (webmasters
will create CGI/PHP scripts, mods and so on). Thus a few Mozilla clients, even
on modem lines, could flood a target server with download requests issued by a
server that is able to act as a client, because each request is small and
JavaScript can produce them rapidly. Now just think of people who paste query
(GET) URLs that would drive a server nuts and send them at a high frequency to
the fetching server. The attacking server (if not prepared against that) cannot
use cached files (since the queries are generated dynamically), and the
attacked server cannot easily recognize that it is under attack. (Much more
sophisticated "script kiddies" may also try to switch proxies dynamically.)

At the moment it is hard to flood a server using only a regular client
connection because of the limits of the connection itself (the upload rate is
pretty low even on a cable connection). But using a fetching server (one with a
backbone connection or even a T3) as a helper would not require the whole
bandwidth, since the home client only has to send small packets over the line.

Maybe I'm paranoid, but I think that this is a potential threat.

What do the specs (HTML 4) say about this kind of indirect uploading?
Oh, and what do the HTTP 1.0/1.1 specs say about this?
Mozilla would do the fetching, not the server ...
So in terms of URLs, what would happen (right now) is you'd enter a URL, Moz
would grab the file from that location on submit, and then Moz would upload it
to the form target.  In fact, with this method we could *avoid* the type of DOS
attack that could be launched against places like

We're protected on the client against most attacks because JS and <input value=>
cannot change the value of a file input.

As for the specs ... the strongest thing I could find the specs say about it is this:

* The current value of a file select is a list of one or more file names. Upon
submission of the form, the contents of each file are submitted with the rest of
the form data. The file contents are packaged according to the form's content type.

I'm pretty sure they weren't even thinking about URLs in the spec, but I really
don't think it causes any *problems* with the spec.  More importantly, it
shouldn't break compatibility; it's a UI change that does not affect the APIs at
all.  There are no interface issues because the interface is unspecified; and
this interface is backwards compatible with the one we have now except if you
have a file named (for example).  I think that's a small
price to pay for the functionality.

I think that bug 46135 lends a great deal of weight and credence to the idea
that a file input is just a way of getting a discrete block of data up to the
server.  That bug looks like W3C guys are backing it.
A final note, since I noticed you talked about the HTTP specs ... they do not
say a word about where you get the data from, last I checked.  They just know
you are sending names and values, or names and files.
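
To illustrate the point above, here is a rough sketch of how one file part is packaged as multipart/form-data. The function only sees a name and a block of bytes, so nothing in the body records whether the contents came from a local disk or were fetched from a URL first. BuildMultipartBody and its parameters are hypothetical names, and the sketch assumes the boundary string does not occur inside the contents.

```cpp
#include <string>

// Hypothetical helper: builds a multipart/form-data body holding a single
// file part. The caller must pick a boundary that does not appear in
// `contents`; this sketch does not check that.
std::string BuildMultipartBody(const std::string& boundary,
                               const std::string& fieldName,
                               const std::string& fileName,
                               const std::string& contentType,
                               const std::string& contents) {
  std::string body;
  // Opening delimiter for the part.
  body += "--" + boundary + "\r\n";
  // Part headers: the field name and the file name reported to the server.
  body += "Content-Disposition: form-data; name=\"" + fieldName +
          "\"; filename=\"" + fileName + "\"\r\n";
  body += "Content-Type: " + contentType + "\r\n";
  // A blank line separates the headers from the payload.
  body += "\r\n";
  body += contents;
  // Closing delimiter ends the whole body.
  body += "\r\n--" + boundary + "--\r\n";
  return body;
}
```

Calling it with the same bytes produces the same body whether those bytes were read from a file or downloaded beforehand, which is why the server side needs no changes.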
well, if mozilla fetches the file, that is a completely different thing.
i based my comment on 

"and then the server side goes off and fetches the file itself."
from the first comment. my mistake.

the idea of having mozilla fetching the file is far, far, far better. 
Severity: normal → enhancement
Priority: -- → P5
Target Milestone: --- → Future
Blocks: 124238
Duplicate of this bug: 412877
Assignee: john → nobody
Depends on: 412822, 412867
QA Contact: vladimire → form-submission
Comment on attachment 298488 [details] [diff] [review]
draft (uses instead of nsIChannel.asyncOpen)

I don't think getting the async version to work is practical within any reasonable time period (the cascade of reworks required hit way too many files, and the number of buffers and copies is really scary). I'd like to have this reviewed for general correctness. If it's correct, modulo the sync Open call, I'd like to be able to have that recorded.

If the sync open call isn't acceptable to module owners for inclusion into, that's fine, but we (Nokia) intend to use this code for our builds.
Attachment #298488 - Flags: review?(jonas)
If you want this for firefox 3 you'll need to push it at the Firefox meeting or in the newsgroups. If not I'll just leave it in my review queue until after the FF3 release.
Drive-by comment: nsDOMFile::GetStream should be returning already_AddRefed<nsIInputStream>, right?  As written, the code leaks the streams...
yes, thanks.
You should probably return an empty string rather than "unknown" for non-urls.
I'd rather not use "" as using "" is asking for something to fail on the server side. how about i use the protocol name?

one example input is:

if i use the protocol name, then when uploading you'd be told "data". which seems vaguely useful (certainly an improvement over "" and probably over "unknown").

And is that the only feedback you have?
Comment on attachment 298488 [details] [diff] [review]
draft (uses instead of nsIChannel.asyncOpen)

> nsDOMFile::GetFileName(nsAString &aFileName)
> {
>-  return mFile->GetLeafName(aFileName);
>+  nsCOMPtr<nsIURL> url(do_QueryInterface(mURI));
>+  if (url) {
>+    nsCAutoString fileName;
>+    nsresult rv = url->GetFilePath(fileName);
>+    if (NS_SUCCEEDED(rv)) {
>+      CopyUTF8toUTF16(fileName, aFileName);
>+      return NS_OK;
>+    }
>+  }
>+  aFileName.AssignLiteral("unknown");

Yeah, using the scheme here might be a good idea.
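
A rough sketch of the fallback being discussed, written against std::string instead of nsIURI/nsIURL so it stands alone. UploadFileName is a hypothetical name, the "scheme://" test is a simplification of what QueryInterface to nsIURL decides in the patch, and keeping "unknown" for values with no scheme at all is this sketch's own assumption.

```cpp
#include <string>

// Hypothetical stand-in for the GetFileName fallback: report the path for
// URL-like inputs, otherwise report the scheme (for example "data").
std::string UploadFileName(const std::string& uri) {
  const std::string::size_type npos = std::string::npos;
  const std::string::size_type colon = uri.find(':');
  if (colon == npos || colon == 0) {
    // No scheme at all; the sketch keeps the patch's original placeholder.
    return "unknown";
  }
  const std::string scheme = uri.substr(0, colon);
  // Simplified URL check: "scheme://authority/path".
  if (uri.compare(colon + 1, 2, "//") == 0) {
    const std::string::size_type pathStart = uri.find('/', colon + 3);
    if (pathStart != npos) {
      // Drop any query string or fragment from the reported path.
      const std::string::size_type pathEnd = uri.find_first_of("?#", pathStart);
      const std::string path = uri.substr(
          pathStart, pathEnd == npos ? npos : pathEnd - pathStart);
      if (path.size() > 1) {
        return path;
      }
    }
  }
  // Not URL-like, or no usable path: fall back to the scheme name.
  return scheme;
}
```

For a data: input this reports "data", matching the behaviour proposed in the comments above, while an http: URL with a path reports that path.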

> nsDOMFile::GetFileSize(PRUint64 *aFileSize)
> {
>+  nsCOMPtr<nsIFileURL> fileURL(do_QueryInterface(mURI));
>+  if (!fileURL)
>+  nsCOMPtr<nsIFile> file;
>+  fileURL->GetFile(getter_AddRefs(file));
>+  if (!file)

Can you do
rv = fileURL->GetFile(getter_AddRefs(file));

here instead?

>@@ -114,18 +134,16 @@ nsDOMFile::GetFileSize(PRUint64 *aFileSi
>+  nsresult rv;
>   nsCAutoString charsetGuess;
>+  nsCOMPtr<nsIChannel> channel;
>+  nsCOMPtr<nsIInputStream> stream(GetStream(&rv, getter_AddRefs(channel)));
>+  if (NS_FAILED(rv))
>+    return rv;
>   if (!aCharset.IsEmpty()) {
>     CopyUTF16toUTF8(aCharset, charsetGuess);
>-  } else {
>+  } else if (channel && NS_FAILED(channel->GetContentCharset(charsetGuess))) {

You probably want to forward the error if GetContentCharset fails here.

Looks ok otherwise.

But like I said, i don't really think we'd want to take this in the main tree due to the sync-loading, so not setting r+ due to that.
Attachment #298488 - Flags: review?(jonas)
timeless, did you want this in 1.9.1?
Blocks: 682838
Component: HTML: Form Submission → DOM: Core & HTML