Closed Bug 92839 Opened 23 years ago Closed 23 years ago

need a common api for uploading (ftp, http, https)

Categories

(Core :: Networking, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla0.9.5

People

(Reporter: Brade, Assigned: dougt)

Details

(Whiteboard: r=darin, sr=rpotts, a=??)

Attachments

(5 files)

We need a common api for uploading files via http, https, ftp, and file 
protocols.  This will help Composer publishing.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.4
can I get a r/sr from gagan and darin?
you might add a comment stating that |stream| MUST implement |nsIInputStream::
Available|, |nsIInputStream::ReadSegments|, and that the |stream| SHOULD
implement either |nsISeekableStream| or |nsIRandomAccessStore| since an upload
request may need to be retried in some cases.

with that, sr=darin
Reversing database corruption caused by bug 95857 and bug 95798.  Pay no
attention to the man behind the curtain.
rather than changing the semantics of nsIInputStream::Available() to mean
'content-length' i think that it would be better to explicitly pass the
'content-length' of the nsIInputStream as an argument to setUploadStream(...)

so, the signature would look something like:
  setUploadStream(in nsIInputStream,
                  in unsigned long contentLength,
                  in string contentType)

Also, i think that it would be useful to have a way to pass a 'file name' rather
than an input stream along...  in the 4x days we did this if the file was too
big, we could intellegently chunk the file during the upload (rather than read
it *all* into memory).  This is critical when uploading multi-megabyte files :-)


perhaps we could also have:
  setUploadFile(in nsIFile, in string contentType)

 
1) regarding content-length as a parameter, i'm not sure if i agree.  without
it, we'd require the input stream to know the content-length which may not be
convenient if it doesn't "own" all of the data.  however, if the input stream is
required to implement nsISeekableStream, then it really must know the entire
length of the data.  moreover, it's not possible to upload a non-blocking input
stream (since we don't have any way of notifying the presence of data in the
stream), so we don't have to worry about Available() returning zero before EOF.
 this leaves me thinking that adding a content-length parameter might be
redundant.  however, perhaps it does make for a clearer API (and even a more
flexible API).

2) necko already writes data in chunks to the socket transport, so i don't see
any need to tell necko that the data is coming explicitly from a file.
As you will note, I am passing nsnull as a content type for setUploadStream.
hey darin,

my motivation behind proposing the setUploadFile(...) method was to guarantee
that chunking was performed at *all* levels not just at the socket transport...

with the current API, it is the caller's responsibility to determine the
"correct" type of stream to use :-)  This leave the door open for "less than
optimial choices" :-)

My idea behind setUpLoadFile(...) was to make *all* of the choices explicit.  If
Necko is passed an nsIFile it can make consistant (and correct) choices about
the *best* input stream to use for the given file size...

Perhaps, i'm just ignorant about the current state of the file stream APIs :-) 
maybe there is *no* way for the consumer to wrap a file in a stream which
buffers the entire file in memory...  if that's now the case, then i agree that
this is unnecessary...
chunking is not a big concern... both the file and socket transports read/write
in chunks.  BUT, having this API would allow us to use PR_SendFile to upload the
file, which would likely be big performance win since it would remove a lot of
buffering.
i guess i haven't been clear in my motivations for a setUploadFile(...) method.

you are correct that, 'chunking' per say is not the biggest issue...  I used the
 term 'chunking' to also mean that the entire file was not buffered into memory... 

What i have been trying to express is that i believe Necko should be responsible
for deciding how best to read a file - not the caller.

By putting the burden on the caller, you open the doors for some very BAD
choices and you eliminate the possibility of some optimizations...

I think that in general there are two situations with File Upload:

1. The caller already has the data in a buffer.  In this case, the
SetUploadStream(...) method should be used.
2. The caller has the 'name' of a file to be uploaded.  In this case, i believe
that the user should use a SetUploadFile(...) method.  This allows necko to make
choices about how *best* to read/send the file...

-- rick
agreed.
re-bucket-ing milestone
Target Milestone: mozilla0.9.4 → mozilla0.9.5
how about a setUploadText method?  it would make uploading much easier from JS.
should these additional setters have getters?
For "setUploadText" would I be able to upload text files only or also image 
files?  setUploadFile seems more general/appropriate (to me).
let me clarify my proposal: setUploadText would take a string argument.  so it
is very different from setUploadFile and setUploadStream.

i don't think that any of these methods should have getters, as i really can't
see any need to do so.  moreover, providing such methods may only complicate
implementations (e.g., why should i remember the nsIFile?).
there are a few places that ask the http channel for the upload/post data currently.
OK.. i wasn't aware of that.

dougt: you will also probably want to make PUT the default http method when an
upload stream has been specified.  previously it was POST, but that is not what
the publish code wants, and since the publish code isn't supposed to know that
it is talking to HTTP, we'd need to cater to it.  this shouldn't be a big change
since most callers of SetUploadStream already explicitly set the method to POST.
but, some may not.
I still don't see the point of having all of these entry points.  

SetFile - if a user has a nsIFile, they should construct a fileinputstream. if
there are problems with the way we do chunking, that is a bug in necko, right?

SetString - isn't there a way to create a string stream from JS?

IMO These extra api's are duplicates
dougt: that was my point originally, but in fact setUploadFile is good because
it would allow us to use a system call to upload the file (see sendfile(2) under
linux).  i suggested setUploadText because it would perhaps eliminate any need
to construct a string stream (which currently cannot be constructed from JS).
but, that aside, i would be willing to live without setUploadText... a string
stream implementation could (and probably should) be exposed to JS.

so, while it is true that setUploadStream + various input stream constructors
would allow clients to upload anything, there may be some advantage to more
specialized methods -- setUploadFile in particular.
I don't buy it.  first, the implementation of a file inputStream could just as
ease call PR_SendFile right?  second, we are optimizing for a case which does
not happen now and will only be used once in a while in the future.  third,
every implemention is going to implmenting it initially the same way - convert
the nsIFile to a nsIFileStream...  

with that being said, lets take a look at the callers of GetUploadStream.  Most
of these are in uri/doc loader.  Rick, can you see if we really need these? 
Also, I think that your LoadURI api is going to have to include a content type
for that upload stream....

yeah, i suppose it would be possible to QI the nsIInputStream to something that
would expose its underlying file descriptor.  then we could call sendfile if
sendfile proved to be a valuable optimization.
I guess one question is just how much you want to expose to the outside world. 
By putting the burden on the caller to create the "right kind of stream" to
enable using SendFile(...) you expose alot more of the underlying implemenation
details of file upload...

1. Probably, the interface that exposes the underlying PRFileDesc will need to
be public so that embeddors can implement their own kinds of file streams and
still benifit from this optimization.

2. This also exposes all of the underlying NSPR file descriptor stuff..
I guess that i would argue that from implementation standpoint, the implementor
of nsIUploadChannel would like to know where the data resides... ie. either in a
buffer, or on disk...  Because this allows different choices to be made at many
levels...

Doug, I guess you are right that all of this "could" be done if we just exposed
a PRFileDesc off of some "file specific" stream...  But it seems clearer to make
this distinction at the nsIUploadChannel API (at least to me :-)
-- rick
note that this patch does not include any specific http handling (eg write out
headers).  http is going to have to handle the case when a content type is
passed explictly as well as the setUploadFile api.  
i think setUploadFile should include a contentType parameter with a NULL value
meaning that the MIME service will be invoked.
other than that, would you be giving a r=?
yes
so there are still at least 2 issues which must be resolved in http before this
api can be used for publishing.

1) default http upload method should be PUT
2) http needs to add the Content-Type and Content-Length headers for the data.
   and current callers need to be modified.

also, what about this business of specifying the content-length in
setUploadStream/setUploadFile?
got r=darin, sr=rpotts.
Whiteboard: r=darin, sr=rpotts, a=??
Fixed checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
V: composer qa is testing publishing, so this is working. It would be nice if
composer had put some kind of basic publish functionality bug as a depeneds so I
could have just waited for that...

brade: one last question:

In your first comment you listed "uploading...file protocols."

What did you mean by that?
Status: RESOLVED → VERIFIED
Summary: need a common api for uploading → need a common api for uploading (ftp, http, https)
Ben--I was hoping that people could publish with a "file" url as well as with
http, https, ftp.  As it turns out, publishing supports the file url because the
persist code supports it and has special code to handle that protocol in
addition to the nsIUploadChannel stuff.
I'm not sure what that means, in an actual feature sense, since you already have
"Save/Save As...". In fact, there are still probably limitations to file URL's
because I don't think we have worked out all the escaping and local character
issues.
I figured ouw why this is desirable while I was out this week. I think
publishing file URL's is really bad, but if we are going to support compsing
documents with URL referenences, I guess we need to support file.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: