Last Comment Bug 371432 - Need an API to store <input type=file> data offline and access when online
: Need an API to store <input type=file> data offline and access when online
Status: RESOLVED FIXED
: dev-doc-complete
Product: Core
Classification: Components
Component: Networking: Cache (show other bugs)
: Trunk
: x86 All
: -- normal (vote)
: mozilla1.9alpha8
Assigned To: Dave Camp (:dcamp)
:
Mentors:
http://wiki.mozilla.org/OfflineAppsSu...
Depends on: 404127 389626 403852 403856
Blocks: 398161
  Show dependency treegraph
 
Reported: 2007-02-23 14:15 PST by Mark Finkle (:mfinkle) (use needinfo?)
Modified: 2007-11-16 20:24 PST (History)
32 users (show)
benjamin: blocking1.9+
jwalden+bmo: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP (42.13 KB, patch)
2007-06-24 19:06 PDT, Dave Camp (:dcamp)
no flags Details | Diff | Splinter Review
implement fileList element (38.04 KB, patch)
2007-07-08 23:53 PDT, Dave Camp (:dcamp)
no flags Details | Diff | Splinter Review
now with nsIDOMFile.idl (40.62 KB, patch)
2007-07-09 23:57 PDT, Dave Camp (:dcamp)
jonas: review-
Details | Diff | Splinter Review
updated patch (43.65 KB, patch)
2007-07-16 13:32 PDT, Dave Camp (:dcamp)
jonas: review+
Details | Diff | Splinter Review
upload binary data with xhr (2.79 KB, patch)
2007-07-22 17:22 PDT, Dave Camp (:dcamp)
no flags Details | Diff | Splinter Review
updated fileList patch (44.08 KB, patch)
2007-07-22 17:26 PDT, Dave Camp (:dcamp)
jonas: review+
jonas: superreview+
Details | Diff | Splinter Review
use the system charset by default (44.49 KB, patch)
2007-07-23 15:00 PDT, Dave Camp (:dcamp)
smontagu: review+
Details | Diff | Splinter Review
check for invalid characters (3.72 KB, patch)
2007-07-23 15:01 PDT, Dave Camp (:dcamp)
jonas: review+
jonas: superreview+
Details | Diff | Splinter Review
xhr patch as committed (with iid update) (4.95 KB, patch)
2007-07-24 22:14 PDT, Dave Camp (:dcamp)
no flags Details | Diff | Splinter Review

Description Mark Finkle (:mfinkle) (use needinfo?) 2007-02-23 14:15:34 PST
It is not possible to store the file location in cache or DOMStorage when offline and then set the data back into <input type=file> form controls when the application comes back online.

Javascript security doesn't allow script to set the control's value. We need a way for the form or window to be able to trust the data that was stored offline so it can be used to post to a server when the application is online.

An example of where this is useful is an offline email aaplication that allows file attachments.
Comment 1 Vladimir Vukicevic [:vlad] [:vladv] 2007-02-23 14:29:22 PST
I don't think it's enough "to trust the data that was stored offline so it can be used to POST"; I'm not quite sure what that means.

I think that you should be able to get at the data in the form onsubmit handler (as a data URI?); at that point you can store it off into storage and do whatever you want with it later.
Comment 2 Mark Finkle (:mfinkle) (use needinfo?) 2007-02-23 14:33:53 PST
I was thinking of just storing the reference to the file(s) instead of getting access to the contents of the file (which could be huge).

Your approach could be nicer in the given example (attachments for email) since the attachment would be the contents of the file at the time the email was submitted.
Comment 3 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2007-02-23 16:05:44 PST
I'm with vlad. In fact this would be useful for many reasons.
Comment 4 Jesse Ruderman 2007-02-23 19:08:55 PST
Storing a reference to the file could lead to the site getting a different version of the file than the one I gave it.  I think that would be bad for security, so I'm with vlad and roc in preferring giving sites the current contents of the file rather than the capability to access the file later.
Comment 5 Mark Finkle (:mfinkle) (use needinfo?) 2007-02-23 20:09:21 PST
Ok, reference to file is a bad idea. Does this have to work with <form> and only be available in onsubmit? Does a data URI make it easier or harder to store? When the client comes back online does the data URI make it easier or harder to POST or  work with the data?

What are the common use cases for using the data? If it is to continue the file POST operation, should the API be geared toward multipart form POST format using XHR?

I am asking so we can start thinking about the API that needs to be created.
Comment 6 Vladimir Vukicevic [:vlad] [:vladv] 2007-02-23 20:40:46 PST
There are really two main use cases:

- Where the file is interesting as just a file, e.g. an attachment
 - this is the "attach a file to email" case

- Where the data is handed to the web app for additional processing
 - e.g., this is google docs "opening" an excel/word spreadsheet
 - e.g., an app that wants to read the data and set some options/config flags

Given that I think the second is going to be pretty important, I don't think the API should be focused on POST'ing the data.  In fact, I think that POST is actually not that interesting, as long as you can encode the data in a useful XHR.

data: URIs are not ideal, but they have the advantage of being easily understood, contain both the mime type and format info (and we can attach disposition bits like filename just fine within the spec).  That still doesn't really solve the second problem, because we'd need to be able to read from it in a useful way.  If you know the data format, you can load it in an iframe and then access the DOM... you can also decode it (e.g. if you know it's XML) and use E4X.

data: URIs are not ideal because they /suck/ to process, and they consume stupid amounts of memory.  We do a lot of URI copying, and with data: URIs we end up copying the full (encoded) data each time.  That sucks, but I don't know of an alternative.  (darin and I came up with a scheme where you would get back some kind of handle type URI, e.g. "x-data-handle:123" which would refer to the real content whenever it was loaded, but we never figured out how to expire it.  We could certainly make it an app-cooperative thing; I think better data access from JS would be more useful, but I don't have any ideas yet.
Comment 7 Hixie (not reading bugmail) 2007-02-23 21:00:03 PST
Could you elaborate on your second use case with an actual example? I'm not sure I follow what kind of Web app would end up in this situation.
Comment 8 Anne (:annevk) 2007-02-24 01:50:05 PST
Some ideas are in here: http://www.w3.org/TR/file-upload/
Comment 9 Vladimir Vukicevic [:vlad] [:vladv] 2007-02-24 15:04:39 PST
Hey, a lot of that stuff in the TR looks exactly like what we're looking for.

Hixie, pretty much all web apps that do something with data need the second case.  Google docs/spreadsheets is the easiest example; open it up, then click Upload.  You can give a local document, either as text, CSV, word, excel, etc., and it'll turn it into a google docs/sheets document.  While it's not really feasable to write an excel parser in javascript, it is more than reasonable to allow uploading or "opening" of text and/or CSV files via an offline version of docs/spreadsheets, to convert it to the internal format that can then be synced to the server.

Another example is an app that i'm working on which uses as input an XML document obtained from another service; it can't get it directly due to cross-domain xmlhttprequest stuff (at least until fx3), and besides, the data is in a password-protected domain and might be generated in another way.  What I want is a way to let the user save the XML file to local disk, then just hand it to my (entirely client-side) web app to parse and do stuff with.
Comment 10 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2007-02-24 16:11:00 PST
One problem with that file-upload spec is that it doesn't integrate with <input type="file"> elements, which would be highly desirable.
Comment 11 Mark Finkle (:mfinkle) (use needinfo?) 2007-02-24 16:16:05 PST
The file-upload spec has some interesting pieces and I agree with Robert about integration with <input type="file">.

I would also like to make sure that it's possible for extensions to get access to the file data. It is entirely feasible that web-app specific extensions could appear that allowed "advanced" handling of local files when offline, such as converting Word and Excel to Google Docs and Spreadsheets format - while offline.
Comment 12 Hixie (not reading bugmail) 2007-02-24 20:26:52 PST
> Hixie, pretty much all web apps that do something with data need the second
> case.  Google docs/spreadsheets is the easiest example; open it up, then click
> Upload.  You can give a local document, either as text, CSV, word, excel, 
> etc., and it'll turn it into a google docs/sheets document.  While it's not 
> really feasable to write an excel parser in javascript, it is more than 
> reasonable to allow uploading or "opening" of text and/or CSV files via an 
> offline version of docs/spreadsheets, to convert it to the internal format 
> that can then be synced to the server.

But then you're not uploading the file. You're uploading text you have control over. Here, what you're asking for is a way to get to the data chosen by the user of the <input type=file> element, not a way to defer uploading or a way to request uploading "by reference rather than by value" (if you see what I mean).
(I guess I misread what you wrote above. My bad.)


> What I want is a way to let the user save the XML file to local disk, then 
> just hand it to my (entirely client-side) web app to parse and do stuff with.

Right. So, a file API, not a deferred file upload API. Ok. I agree that's useful. You should work with the group doing the draft cited above.

(I agree that the draft's major problem is that it doesn't integrate with <input type=file>, which we would desire as a way to reduce the target surface area.)
Comment 13 Anne (:annevk) 2007-02-25 08:48:06 PST
The main feedback on the draft was that instead of invoking the open() method it defines now is dispatching the event directly on the <input type=file> control (where you can upload multiple files per Web Forms 2) the moment the user has selected the files. This shouldn't be much less secure from what you can do now (invoking .submit() and processing the files on the server).
Comment 14 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2007-02-25 13:27:45 PST
That sounds good. So we could just lose the FileDialog interface, make open() a method in the <input> element and make FileList be an interface on the <input> element?
Comment 15 Anne (:annevk) 2007-02-25 13:44:21 PST
My suggestion would to let the 'files-selected' dispatch on <input> the moment the files are available. The event object then has a pointer to the FileList object.
Comment 16 Hixie (not reading bugmail) 2007-02-25 22:02:45 PST
Please use the 'change' and 'input' events, not a new one ('files-selected')...
Comment 17 Anne (:annevk) 2007-02-26 10:18:48 PST
Yeah, never mind my suggestion above. Give HTMLInputElement a new member called fileList which is initially null and FileList once one or more files are selected.
Comment 18 Benjamin Smedberg [:bsmedberg] 2007-06-19 18:46:34 PDT
Throwing features without owners off the 1.9 train.
Comment 19 Mike Shaver (:shaver -- probably not reading bugmail closely) 2007-06-19 19:39:12 PDT
Maybe I'm not up to date on our current triage model, but without this the offline app story is pretty weak.  Why wouldn't blocking be distinct from whether it has an owner, and blocking status be used to allocate resources as they become available, or even adjust schedule?  I think we would be making a pretty big mess if we shipped Gecko 1.9/Fx3 without this, so I'm re-?ing for reconsideration.
Comment 20 Dave Camp (:dcamp) 2007-06-19 19:40:14 PDT
And I'm happy to take it, I have a patch mostly ready
Comment 21 Hixie (not reading bugmail) 2007-06-20 02:56:11 PDT
The TR/file-upload draft seems to be dead.

I intend to add the following to Web Forms 2 in due course:

   HTMLInputElement
     readonly attribute FileList files;
      // null unless type=file
      // if type=file, returns a FileList object with all the currently selected
      // files. The 'change' and 'input' events are fired when the list changes.

   FileList
      // list of File objects, with the length attribute, item() method, and 
      // [[Get]] behaviour typical of DOM lists

   File
      // some API to read the file.

I don't know what a good way of exposing File would be. What's in the TR/file-upload draft is kinda awkward. You'd really want a text version and a binary version (byte array), and possibly a version that is the result of parsing the file with an XML or HTML parser (a DOM Document version). JavaScript doesn't really have a good story for byte arrays, though. The encoding detection here would probably be similar to the XMLHttpRequest stuff. You might want to expose the filename (though not the path) and the MIME type (though that's unreliable, so maybe not?).

If you implement this let me know and tell me what your experience was. I'd love to take it into account when writing the spec.
Comment 22 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2007-06-20 18:27:19 PDT
Instead of a byte array, a string should work fine (assuming we assume 8859-1 encoding, i.e. each byte padded with a null byte to form a UTF16 codepoint). Or we could return a data: URI. Both APIs could be useful in different situations.
Comment 23 Hixie (not reading bugmail) 2007-06-20 22:44:57 PDT
A string is pretty inefficient, and is susceptible to all kinds of problems later if we extend this API to, e.g., allow the binary data to be posted. e.g. what happens if you have a character with a codepoint greater than 255? Or what if the string goes through some kind of CRLF normalisation? Seems cleaner to have a real byte array if we can do it...

A data: URI could be useful too (the TR/file-upload draft has base64 and hex strings as the possible encodings). However, people haven't been overly impressed with our use of data: URIs in <canvas>, so I wouldn't run to that solution.
Comment 24 Dave Camp (:dcamp) 2007-06-24 19:06:36 PDT
Created attachment 269618 [details] [diff] [review]
WIP

Here's a WIP implementation of fileList for comment.

I went with the 8859-1-stuffed-in-a-string approach for now.  It has the benefit of easily working with globalStorage without needing to rework that interface.  I also included a base64 encoded version for building data: uris.

For this to be useful we probably need some good way to get a file stored in globalStorage up to the server.  Maybe some way to tell XMLHttpRequest to do encode as binary (lossily convert to 8bit chars and don't do any funny stuff with newlines)?

The encoding stuff probably needs some work.  Maybe the default encoding should be the system encoding rather than UTF-8?  Maybe we should use the charset detection stuff?
Comment 25 Anne (:annevk) 2007-06-25 01:58:23 PDT
I would look at the BOM, use OS metadata (if any) and default to UTF-8.

Internet Explorer 7 exposes responseBody on XMLHttpRequest that returns an array of unsigned integers representing octets per MSDN. The send() method also accepts that as argument. I haven't been able to reverse engineer that so far though, but then I haven't tried very hard. Maybe that's worth looking into.
Comment 26 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-06-28 17:02:05 PDT
Me and Dave talked about this a bit yesterday. These were the things we thought of that you might want to do with a file:

* Store the file in DOM-storage
* Send the file using XMLHttpRequest
* Display the file as an image on the page
* Treat the file as a textfile and read its contents
* Treat the file as an XML or HTML document
* Treat the file as binary and parse its contents

The first two would let you do offline email with file attachments by storing the file in DOM-storage, and then sending it using XHR.

The third is useful for forms that takes images such as avatars, buddy icons, flickr etc since it can show the chosen image before it's sent to the server.

The fourth is useful to parse file formats that use text files, and also lets you insert text documents as body for emails etc.

Fifth is useful to interact with files that the user got from the web or plans to upload to the web.

Last is a catch-all that lets the webpage interact with any other type of format that doesn't fall into any of the previous categories.


So what I suggest is the following API:

File {
  readonly attribute DOMString      fileName;
  readonly attribute unsigned long  fileSize;

  DOMString getText(DOMString encoding);
  DOMString getAsBase64();
  DOMString getAsBinary();
  Document parseAsDocument();
};

The 'encoding' argument to getText() lets the script guess the encoding based on either asking the user, or using the common encoding for the target audience of the website. If the encoding is left empty some default algorithm can be used to guess the encoding (such as look for a BOM) and throw if none is found.

getAsBase64() can be used to build a data-uri which can then be set on an image tag or iframe to preview the document.

getAsBinary() can be used to extract the data and store it in DOM-Storage or parse it. This would expand the data to twice its normal size, but I don't think this is a huge deal compared to the complexity of parsing the data. If we wanted to make it slightly more efficient to deal with large files we could have getSectionAsBinary(long start, long end) to let the code just keep parts of the file expanded at a time. This function also lets you extract the data as a string to put in DOM storage or send using XHR.

parseAsDocument() allows you to treat the file as a HTML or XML file.


Apart from the problem that getAsBinary returns a bloated string, that string is somewhat problematic to send over XHR. This is because XHR will utf8 encode the string which will make converting it back to binary data somewhat cumbersome.

One way to solve this is to let XHR take an encoding in addition to the string to be sent. This would let you specify an encoding that will convert the expanded string into binary data before it's sent over the wire.

Another solution would be to let both domstorage and XHR to take File objects directly in addition to the types they are currently able to send.
Comment 27 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2007-06-28 17:23:57 PDT
That sounds good.

You might want to add a readonly attribute DOMString "fileType", to get/guess the MIME type of the file (based on metadata, file extension associations, sniffing or whatever else the underlying platform wants to do).

Maybe we should move this discussion to the WHATWG list (but we could start implementing it as well).
Comment 28 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-06-28 17:28:17 PDT
I'm a bit reluctant to advocate APIs that guess without also specifying exactly how that guessing is done. Otherwise we end up with UAs having to reverse engineer each other and implementations that grow in size and complexity.

I actually think we should take this to the WebAPI group. They would love to have someone write this up, all they need is someone to step up and take the time.
Comment 29 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2007-06-28 17:46:38 PDT
> I'm a bit reluctant to advocate APIs that guess without also specifying exactly
> how that guessing is done.

We're not talking about guessing the MIME type of Web content here, just the user's local file data. So the interoperability concerns are not large; if a particular file is assigned a different type by different browsers, nothing much breaks. At worst a user trying to work with the same file in web apps running in different browsers might get confused.

In fact the method will definitely be platform dependent, since different platforms have different approaches and capabilities for determining the type of local file data; it also typically depends on what applications have been installed or whatever. It's related to the question of what happens when the user double-clicks on the file --- I don't think we can/should strive for universal compatibility there.

Web apps will not be able to rely on the type being accurate, but they're no worse off than desktop apps in that respect. Providing our guess will make it a lot easier for apps to implement your scenario of turning a file into a data: URL to display as an image or whatever.

Actually as a related issue we might want to provide the platform's icon for the file, say as a data: URL. I bet attachment users such as GMail would love to have that.
Comment 30 Hixie (not reading bugmail) 2007-06-28 18:19:39 PDT
Comment 21 plus comment 26 looks good to me. E-mail the WHATWG list to let people know you're implementing this and I'll add it to the spec in a few weeks.
Comment 31 Nickolay_Ponomarev 2007-06-29 07:38:31 PDT
(In reply to comment #26)
>   DOMString getText(DOMString encoding);
> 
> The 'encoding' argument to getText() lets the script guess the encoding based
> on either asking the user, or using the common encoding for the target audience
> of the website. If the encoding is left empty some default algorithm can be
> used to guess the encoding (such as look for a BOM) and throw if none is found.

I don't think the webapp can guess the encoding the user uses in a text file. How would it work? It can't ask the user what encoding the file is in, and it can't guess the encoding based on the language in its settings or on the HTTP headers - at least because there are at least several encodings the file could be in: the default encoding for the language (there could be several), and various flavours of Unicode.

Wouldn't it be more useful to let our charset detectors do the work? And maybe let the user select the encoding (with file contents preview) in case we mess up.
Comment 32 Dave Camp (:dcamp) 2007-07-08 23:53:05 PDT
Created attachment 271493 [details] [diff] [review]
implement fileList element
Comment 33 Dave Camp (:dcamp) 2007-07-09 23:57:01 PDT
Created attachment 271625 [details] [diff] [review]
now with nsIDOMFile.idl
Comment 34 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-10 17:54:35 PDT
Comment on attachment 271625 [details] [diff] [review]
now with nsIDOMFile.idl

>diff -r 8cf7bb07f12a content/base/public/nsDOMFile.h
>+class nsDOMFileList : public nsIDOMFileList
>+{
>+public:
>+  NS_DECL_ISUPPORTS
>+  NS_DECL_NSIDOMFILELIST
>+
>+  PRBool Append(nsIDOMFile *aFile);
>+
>+  PRBool Remove(PRUint32 aIndex);
>+  void Clear();

Just inline the impl of these three since the impl is so simple and non-virtual.


>diff -r 8cf7bb07f12a content/base/src/nsDOMFile.cpp
>+nsDOMFile::GetFileSize(PRUint32 *aFileSize)
>+{
>+  PRInt64 fileSize;
>+  nsresult rv = mFile->GetFileSize(&fileSize);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  if (fileSize < 0) {
>+    return NS_ERROR_FAILURE;
>+  }
>+
>+  *aFileSize = fileSize;

You should cap at 2^32-1 rather than wrap here. And add a comment to that extent in the idl. Or could we even make the idl use a 64bit? If not we can always add that later as long as we document the current behavior.

>+nsDOMFile::GetText(const nsAString &aCharset, nsAString &aResult)
>+{
>+  aResult.Truncate();
>+
>+  nsCOMPtr<nsIInputStream> stream;
>+  nsresult rv = NS_NewLocalFileInputStream
>+                  (getter_AddRefs(stream),
>+                   mFile, -1, -1,
>+                   nsIFileInputStream::CLOSE_ON_EOF);
>+  if (NS_FAILED(rv)) return DOMFileResult(rv);

Use NS_ENSURE_SUCCESS(rv, DOMFileResult(rv)), same thing in other places in this file.

>+  nsCAutoString charsetGuess;
>+  if (!aCharset.IsVoid()) {

Use IsEmpty rather than IsVoid, it's such a pain for C++ callers to have to void their strings.

more to come
Comment 35 Benjamin Smedberg [:bsmedberg] 2007-07-11 11:17:08 PDT
blocking+ per platform meeting
Comment 36 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-11 17:23:06 PDT
Comment on attachment 271625 [details] [diff] [review]
now with nsIDOMFile.idl

Two functional changes we talked about:

We might want to change GetAsBase64 to GetAsDataURL since that way the user won't have to figure out the mimetype.

We might as well do our best to figure out the encoding, rather than just looking at the BOM. This shouldn't be a web compatibility issue since these aren't files from the web but rather from the users local filesystem.

>+nsDOMFile::GetAsBinary(nsAString &aResult)
>+{
>+  aResult.Truncate();
>+
>+  nsCOMPtr<nsIInputStream> stream;
>+  nsresult rv = NS_NewLocalFileInputStream
>+                  (getter_AddRefs(stream),
>+                   mFile, -1, -1,
>+                   nsIFileInputStream::CLOSE_ON_EOF);
>+  if (NS_FAILED(rv)) return DOMFileResult(rv);
>+
>+  char readBuf[4096];
>+  PRUint32 numRead;
>+  rv = stream->Read(readBuf, sizeof(readBuf), &numRead);
>+  while (NS_SUCCEEDED(rv) && numRead > 0) {
>+    AppendASCIItoUTF16(Substring(readBuf, readBuf + numRead), aResult);
>+    rv = stream->Read(readBuf, sizeof(readBuf), &numRead);
>+  }
>+
>+  return DOMFileResult(rv);

Would that loop be better written as a do-while? Also, I generally don't like functions that end with |return rv;|, it's better to deal with error-codes right away to not risk someone over-writing them in later added code.

>+nsDOMFile::ParseAsDocument(nsIDOMDocument **aContentsXML)
>+{
>+  if (mDocument) {
>+    NS_ADDREF(*aContentsXML = mDocument);
>+    return NS_OK;
>+  }
>+
>+  nsCOMPtr<nsIURI> fileURI;
>+  nsresult rv = NS_NewFileURI(getter_AddRefs(fileURI), mFile);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  nsCOMPtr<nsIChannel> channel;
>+  rv = NS_NewChannel(getter_AddRefs(channel), fileURI);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  nsCOMPtr<nsISyncLoadDOMService> syncLoadService =
>+    do_GetService(NS_SYNCLOADDOMSERVICE_CONTRACTID);
>+  if (!syncLoadService) {
>+    return NS_ERROR_FAILURE;
>+  }

So the problem with the syncloadservice is that it only creates XML documents. It can't be used to create HTML documents :(

We might want to leave this function out for now. Unless you can figure out an easy way to get the syncloader to do the right thing.

>diff -r 8cf7bb07f12a content/html/content/src/nsHTMLInputElement.cpp
>+nsHTMLInputElement::GetFile(nsIFile** aFile)
>+{
>+  *aFile = nsnull;
>+
>+  if (!mFileName) {
>+    return NS_ERROR_NOT_AVAILABLE;
>+  }

Please also abort if |mType != NS_FORM_INPUT_FILE|, just as an extra level of safety. Actually it might even be needed if someone grabs .fileList and then changes the type.

>+nsHTMLInputElement::UpdateFileList()
>+{
>+  if (mFileList) {
>+    mFileList->Clear();
>+
>+    if (mFileName) {
>+      nsCOMPtr<nsIFile> file;
>+      nsresult rv = GetFile(getter_AddRefs(file));
>+      NS_ENSURE_SUCCESS(rv, rv);
>+
>+      nsIPrincipal* principal = nsnull;
>+      nsIDocument* doc = GetCurrentDoc();
>+      if (doc) {
>+        principal = doc->NodePrincipal();
>+      }
>+
>+      nsRefPtr<nsDOMFile> domFile = new nsDOMFile(file, principal);

Just use NodePrincipal() directly, that function lives on all nodes, no need to get it off of the document.

>+      if (domFile) {
>+        rv = mFileList->Append(static_cast<nsIDOMFile*>(domFile));

You shouldn't need to do this cast explicitly.
Append returns a PRBool, not an nsresult.

You need to call UpdateFileList when mType is switched to/from type=file, so that the file-list is empty when the type isn't file.

Marking r- for the bigger changes we talked about.
Comment 37 Dave Camp (:dcamp) 2007-07-16 13:32:03 PDT
Created attachment 272539 [details] [diff] [review]
updated patch

* Changed getAsBase64() to getAsDataURL()
* Removed parseAsDocument() for now
* Renamed getTest() to getAsTest()
* Try harder to detect the charset of the local file 
* Made fileSize 64 bit
* Fixes for the other comments
Comment 38 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-17 16:27:53 PDT
Comment on attachment 272539 [details] [diff] [review]
updated patch

>diff -r ff3f734ca97a content/base/src/nsDOMFile.cpp
>+nsDOMFile::GuessCharset(nsIInputStream *aStream,
>+                        nsACString &aCharset)
>+{
>+
>+  if (!mCharset.IsEmpty()) {
>+    aCharset = mCharset;
>+    return NS_OK;
>+  }
>+
>+  // First try the universal charset detector
>+  nsCAutoString detectorContractID;
>+  detectorContractID.AssignLiteral(NS_CHARSET_DETECTOR_CONTRACTID_BASE);
>+  detectorContractID.AppendLiteral("universal_charset_detector");
>+
>+  nsCOMPtr<nsICharsetDetector> detector;
>+  detector = do_CreateInstance(detectorContractID.get());

No need for detectorContractID, just do:
detector = do_CreateInstance(NS_CHARSET_DETECTOR_CONTRACTID_BASE
                             "universal_charset_detector");

>diff -r ff3f734ca97a content/html/content/src/nsHTMLInputElement.cpp
>@@ -1887,32 +1952,35 @@ nsHTMLInputElement::ParseAttribute(PRInt
>                                    nsAttrValue& aResult)
> {
>   if (aNamespaceID == kNameSpaceID_None) {
>     if (aAttribute == nsGkAtoms::type) {
>       // XXX ARG!! This is major evilness. ParseAttribute
>       // shouldn't set members. Override SetAttr instead
>       if (!aResult.ParseEnumValue(aValue, kInputTypeTable)) {
>         mType = NS_FORM_INPUT_TEXT;
>+        UpdateFileList();
>         return PR_FALSE;
>       }

Call |SetFileName(EmptyString());| instead so that mFileName is nulled out too.

Also fix the Append calls return-value.

r=me with that.
Comment 39 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-17 16:38:40 PDT
Comment on attachment 272539 [details] [diff] [review]
updated patch

Actually, rather than inserting the SetFileName() call when ParseEnumValue fails, you could do something like this:

PRInt32 newType = aResult.ParseEnumValue(aValue, kInputTypeTable) ?
                  newType = aResult.GetEnumValue() :
                  NS_FORM_INPUT_TEXT;

And remove the first if-statement completely. That should make us fall into the right code paths.
Comment 40 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-17 16:44:00 PDT
Hrm.. except that you also need to return PR_FALSE...
Comment 41 Dave Camp (:dcamp) 2007-07-22 17:22:12 PDT
Created attachment 273345 [details] [diff] [review]
upload binary data with xhr

Here's the accompanying patch to upload data from the fileList with XHR.

In the future it would be nicer to let DOM storage store File objects and XHR upload them.
Comment 42 Dave Camp (:dcamp) 2007-07-22 17:26:15 PDT
Created attachment 273346 [details] [diff] [review]
updated fileList patch

Simon, would you mind checking my charset detector use?
Comment 43 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-22 19:06:24 PDT
Comment on attachment 273345 [details] [diff] [review]
upload binary data with xhr

Hmm.. might be good to throw an exception if there are high-order bytes in the string.

looks good otherwise.
Comment 44 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-22 19:10:25 PDT
Comment on attachment 273346 [details] [diff] [review]
updated fileList patch

r/sr=me though it would be great to get simons review on the charset stuff.
Comment 45 Simon Montagu :smontagu 2007-07-22 22:13:22 PDT
I think I would use the locale charset rather than UTF-8 as final fallback. This will be the encoding of e.g. local files saved as "ANSI" by Notepad on Windows, and in most cases the charset detector won't have detected it correctly since it doesn't detect most of the ISO-8859 family at all.
Comment 46 Dave Camp (:dcamp) 2007-07-23 15:00:13 PDT
Created attachment 273470 [details] [diff] [review]
use the system charset by default
Comment 47 Dave Camp (:dcamp) 2007-07-23 15:01:29 PDT
Created attachment 273471 [details] [diff] [review]
check for invalid characters
Comment 48 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-23 16:32:34 PDT
Comment on attachment 273471 [details] [diff] [review]
check for invalid characters

Rev the IID on nsIXMLHttpRequest

I like NS_ENSURE_TRUE(data, NS_ERROR_OUT_OF_MEMORY), but some don't, so up to you.

r/sr=me either way
Comment 49 Dave Camp (:dcamp) 2007-07-24 21:39:09 PDT
fileList patch committed:

Checking in content/base/public/Makefile.in;
/cvsroot/mozilla/content/base/public/Makefile.in,v  <--  Makefile.in
new revision: 1.67; previous revision: 1.66
done
RCS file: /cvsroot/mozilla/content/base/public/nsDOMFile.h,v
done
Checking in content/base/public/nsDOMFile.h;
/cvsroot/mozilla/content/base/public/nsDOMFile.h,v  <--  nsDOMFile.h
initial revision: 1.1
done
RCS file: /cvsroot/mozilla/content/base/public/nsIDOMFile.idl,v
done
Checking in content/base/public/nsIDOMFile.idl;
/cvsroot/mozilla/content/base/public/nsIDOMFile.idl,v  <--  nsIDOMFile.idl
initial revision: 1.1
done
RCS file: /cvsroot/mozilla/content/base/public/nsIDOMFileException.idl,v
done
Checking in content/base/public/nsIDOMFileException.idl;
/cvsroot/mozilla/content/base/public/nsIDOMFileException.idl,v  <--  nsIDOMFileException.idl
initial revision: 1.1
done
RCS file: /cvsroot/mozilla/content/base/public/nsIDOMFileList.idl,v
done
Checking in content/base/public/nsIDOMFileList.idl;
/cvsroot/mozilla/content/base/public/nsIDOMFileList.idl,v  <--  nsIDOMFileList.idl
initial revision: 1.1
done
Checking in content/base/src/Makefile.in;
/cvsroot/mozilla/content/base/src/Makefile.in,v  <--  Makefile.in
new revision: 1.109; previous revision: 1.108
done
RCS file: /cvsroot/mozilla/content/base/src/nsDOMFile.cpp,v
done
Checking in content/base/src/nsDOMFile.cpp;
/cvsroot/mozilla/content/base/src/nsDOMFile.cpp,v  <--  nsDOMFile.cpp
initial revision: 1.1
done
Checking in content/html/content/src/Makefile.in;
/cvsroot/mozilla/content/html/content/src/Makefile.in,v  <--  Makefile.in
new revision: 1.74; previous revision: 1.73
done
Checking in content/html/content/src/nsHTMLInputElement.cpp;
/cvsroot/mozilla/content/html/content/src/nsHTMLInputElement.cpp,v  <--  nsHTMLInputElement.cpp
new revision: 1.463; previous revision: 1.462
done
Checking in dom/public/nsDOMClassInfoID.h;
/cvsroot/mozilla/dom/public/nsDOMClassInfoID.h,v  <--  nsDOMClassInfoID.h
new revision: 1.23; previous revision: 1.22
done
Checking in dom/public/nsDOMError.h;
/cvsroot/mozilla/dom/public/nsDOMError.h,v  <--  nsDOMError.h
new revision: 3.17; previous revision: 3.16
done
Checking in dom/public/idl/html/nsIDOMNSHTMLInputElement.idl;
/cvsroot/mozilla/dom/public/idl/html/nsIDOMNSHTMLInputElement.idl,v  <--  nsIDOMNSHTMLInputElement.idl
new revision: 1.12; previous revision: 1.11
done
Checking in dom/src/base/domerr.msg;
/cvsroot/mozilla/dom/src/base/domerr.msg,v  <--  domerr.msg
new revision: 1.15; previous revision: 1.14
done
Checking in dom/src/base/nsDOMClassInfo.cpp;
/cvsroot/mozilla/dom/src/base/nsDOMClassInfo.cpp,v  <--  nsDOMClassInfo.cpp
new revision: 1.457; previous revision: 1.456
done
Checking in dom/src/base/nsDOMClassInfo.h;
/cvsroot/mozilla/dom/src/base/nsDOMClassInfo.h,v  <--  nsDOMClassInfo.h
new revision: 1.121; previous revision: 1.120
done
Checking in dom/src/base/nsDOMException.cpp;
/cvsroot/mozilla/dom/src/base/nsDOMException.cpp,v  <--  nsDOMException.cpp
new revision: 1.24; previous revision: 1.23
done
Checking in dom/src/base/nsDOMException.h;
/cvsroot/mozilla/dom/src/base/nsDOMException.h,v  <--  nsDOMException.h
new revision: 1.5; previous revision: 1.4
done
Checking in dom/src/base/nsDOMScriptObjectFactory.cpp;
/cvsroot/mozilla/dom/src/base/nsDOMScriptObjectFactory.cpp,v  <--  nsDOMScriptObjectFactory.cpp
new revision: 1.19; previous revision: 1.18
done
Checking in xpcom/base/nsError.h;
/cvsroot/mozilla/xpcom/base/nsError.h,v  <--  nsError.h
new revision: 1.52; previous revision: 1.51
done
Comment 50 Dave Camp (:dcamp) 2007-07-24 21:53:54 PDT
and xhr-binary.diff:

Checking in content/base/public/nsIXMLHttpRequest.idl;
/cvsroot/mozilla/content/base/public/nsIXMLHttpRequest.idl,v  <--  nsIXMLHttpRequest.idl
new revision: 1.33; previous revision: 1.32
done
Checking in content/base/src/nsXMLHttpRequest.cpp;
/cvsroot/mozilla/content/base/src/nsXMLHttpRequest.cpp,v  <--  nsXMLHttpRequest.cpp
new revision: 1.188; previous revision: 1.187
done
Comment 51 Dave Camp (:dcamp) 2007-07-24 22:14:20 PDT
Created attachment 273731 [details] [diff] [review]
xhr patch as committed (with iid update)
Comment 52 Takeshi Kurosawa 2007-07-25 00:35:55 PDT
Why the extension of XHR has landed without any proposals to W3C Web API Working Group.

According to the last XMLHttpReqeust Working Draft,
- http://www.w3.org/TR/2007/WD-XMLHttpRequest-20070618/#extensibility

"User agents, Working Groups and other interested parties should discuss extensions on a relevant public forum, preferably public-webapi@w3.org".

So, we should propose it then implement as per the new spec.


If we don't have enough time and cannot change the implementation along the discussion, we should add the |moz| prefix.
- http://wiki.mozilla.org/Firefox3/Schedule
Comment 53 Michael Ventnor 2007-07-26 18:36:25 PDT
I suspect this will receive similar controversy to <a ping>... can a pref be implemented to turn this off?
Comment 54 Boris Zbarsky [:bz] 2007-07-26 20:18:22 PDT
Dave, could you do an empty checkin to these files with this bug number (using cvs commit -f)?  The bug number never made it into the checkin comment...

And Takeshi is right: we should raise this with the WG.
Comment 55 :Gavin Sharp [email: gavin@gavinsharp.com] 2007-07-26 20:21:05 PDT
You can also file a mozilla.org :: CVS Administration bug to have the commit messages for those revisions updated to include the bug number.
Comment 56 Dave Camp (:dcamp) 2007-07-27 14:02:25 PDT
requested the change in bug 389892
Comment 57 Jesse Ruderman 2007-07-27 15:12:32 PDT
> I suspect this will receive similar controversy to <a ping>... can a pref
> be implemented to turn this off?

Why would this be controversial?
Comment 58 Michael Ventnor 2007-07-27 16:02:35 PDT
(In reply to comment #57)
> > I suspect this will receive similar controversy to <a ping>... can a pref
> > be implemented to turn this off?
> 
> Why would this be controversial?

I guess people may think this will provide extra opportunity for websites to find an exploit in this to arbitrarily read files.
Comment 59 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-27 18:54:23 PDT
People thinking that would be wrong. I'm not happy about bloating the code, by adding useless prefs, for PR reasons.
Comment 60 Nickolay_Ponomarev 2007-09-15 06:13:37 PDT
this is mentioned on http://developer.mozilla.org/en/docs/Firefox_3_for_developers#Offline_resources, but not documented yet.
Comment 61 georgi - hopefully not receiving bugspam 2007-09-17 08:46:59 PDT
how do i use this from web (or web offline) content?
Comment 62 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-09-19 13:44:39 PDT
<input id=f type=file>

document.getElementById('f').fileList[0].getAsText("")
document.getElementById('f').fileList[0].getAsText("utf8")
document.getElementById('f').fileList[0].getAsDataURL()
document.getElementById('f').fileList[0].getAsBinary()
Comment 63 Eric Shepherd [:sheppy] 2007-10-10 13:45:38 PDT
OK, just so I'm clear on what needs documenting here:

We need to write reference docs for:

* nsIDOMFile
* nsIDOMFileList
* nsIDOMFileException

And then a how-to article explaining how to take advantage of this capability.

Any chance I can get a volunteer to write the how-to article if I do the reference articles? :)
Comment 64 Eric Shepherd [:sheppy] 2007-10-12 08:25:39 PDT
In the IDL for nsIDOMFile, there's the comment "raises(FileException)" on retrieval.  This confuses me, since the nsIDOMFileException only appears to apply when retrieval fails.  Can someone clarify this for me?
Comment 65 Eric Shepherd [:sheppy] 2007-10-12 08:35:50 PDT
It would also be helpful if there were useful comments in the 3 IDL files associated with this patch.
Comment 66 :Gavin Sharp [email: gavin@gavinsharp.com] 2007-10-12 12:29:11 PDT
(In reply to comment #64)
> In the IDL for nsIDOMFile, there's the comment "raises(FileException)" on
> retrieval.  This confuses me, since the nsIDOMFileException only appears to
> apply when retrieval fails.  Can someone clarify this for me?

That looks like the standard DOM way of saying "if it raises an exception, it raises a FileException" (it's a common pattern in other nsIDOM* interfaces, I think it comes from the W3 IDLs). I agree that it's not very clear. Our implementations don't actually seem follow that to the letter (some uncommon failures return exceptions that are not FileExceptions), but I think that's the idea.
Comment 67 Eric Shepherd [:sheppy] 2007-10-12 14:13:51 PDT
OK, the input HTML element isn't documented on our site yet, so I put samples of how to reference the file list into the file list documentation; this stuff is now documented:

http://developer.mozilla.org/en/docs/nsIDOMFile
http://developer.mozilla.org/en/docs/nsIDOMFileList
http://developer.mozilla.org/en/docs/nsIDOMFileException

Marking this as documentation complete.
Comment 68 Olli Pettay [:smaug] (vacation Aug 25-28) 2007-10-17 07:58:59 PDT
Shouldn't this have register NS_ERROR_MODULE_DOM_FILE
using RegisterExceptionProvider in nsDOMScriptObjectFactory::nsDOMScriptObjectFactory?

Note You need to log in before you can comment on or make changes to this bug.