Closed Bug 1260606 Opened 8 years ago Closed 8 years ago

FileReader.readAsText(HTML_FORM_INPUT.files[0]) fails on content size change

Categories

(Core :: DOM: Core & HTML, defect)

45 Branch
defect
Not set
normal

Tracking

()

RESOLVED INVALID
Tracking Status
firefox45 --- affected
firefox46 - wontfix
firefox47 - wontfix
firefox48 - wontfix
firefox-esr45 - ---

People

(Reporter: BijuMailList, Assigned: baku)

References

Details

(Keywords: dev-doc-needed, regression, site-compat)

Attachments

(2 files, 1 obsolete file)

Attached file read_file.html
+++ This bug was initially created as a clone of Bug #1253491 +++
FileReader.readAsText(HTML_FORM_INPUT.files[0]) fails on content size change.
Separating the regression part from original bug.

Steps:-
1. Create a text file in notepad at path C:\moztest.txt
2. Enter text file as content "Mary had a little lamp" 
3. Save C:\moztest.txt and note the number of bytes 
4. Switch to Firefox
5. Open attachment read_file.html
6. Click browse button 
7. open file C:\moztest.txt
8. See the text loaded "Mary had a little lamp"
9. Switch to Notepad C:\moztest.txt
10. Change text "Mary" to "Jane"
11. Save C:\moztest.txt make sure byte count have not changed
12. Switch to Firefox
13. Click "Reload" button
14. See text changed to "Jane had a little lamp"
15. Switch to Notepad C:\moztest.txt
16. Change text "Jane" to "Joe"
17. Save C:\moztest.txt make sure byte count have changed
18. Switch to Firefox
19. Click Reload button

Expected 
* Text changed to "Joe had a little lamp"

Result
47.0a1 (2016-03-03) 
* New text is not reloading, 
  content still say  "Jane had a little lamp"
* If console log is checked you see "onerror" fired

 	
44.0.2
* Text changes from "Jane" to "Joe"
  and read as "Joe had a little lamp"
* But if console log is checked then it showa
  in loadFileAsTxt reader.onload
  loaded : 21
  total : 22
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=528ea05671e9bd9ccb33d1558a20691a72c85f98&tochange=a8965ae93c5d098a4f91ad9da72150bb43df07a7

Bug 1198095 - FileReader should dispatch an error if the blob changes size in the meantime the read is executed, r=bz
Blocks: 1198095
OS: Windows 7 → All
Hardware: x86 → All
Version: Trunk → 45 Branch
I see 2 approaches here:

1. Use nsINativeFileWatcherService to follow changes in the underline file of the Blob. Problem is that, currently, nsINativeFileWatcherService works on windows only. We can add the support for other platforms based on Kqueue or Fsevent or INotify but it's a big task.

2. Do I/O all the time BlobImplFile::GetSize() is called. Easy, but it means we do I/O on the main-thread.
And for IPC, we do a sync IPC request if needed from content process to parent process.

I think it's better to go for the second approach.
Posted the site compatibility doc for reference: https://www.fxsitecompat.com/en-CA/docs/2016/filereader-readastext-fails-when-file-is-updated-after-initial-reading/

[Tracking Requested - why for this release]: A site compatibility regression from Firefox 45. 1 duplicate bug reported. Too late for 45.0.2 but 45 ESR and 46+ require the fix.
Has Regression Range: --- → yes
The site compatibility doc claims that the issue arises when a previous FileReader instance is reused when reloading the File, but I see when a new FileReader instance is used. Maybe you're referring to the File instance (which is reused)?
(In reply to Andrea Marchesini (:baku) from comment #2)
> I see 2 approaches here:

1.
Can we check how Google Chrome has implemented this.

2.
Another observation, In Firefox when we do FORM submit (POST) the web server receive the latest version of the file.

3.
(sorry if i am naive)
At end of the low level OS file I/O read operation, we will be knowing actual size of the file.
So at time cant we fix 
* files[0].size;
* event.loaded;
* event.total;
* event.target.result.byteLength;

If we can change event.target.result.byteLength after created, we should think adding extra property some thing like.
* event.target.result.contentByteLength;
   To say actual size of file.
and 
* event.target.result.remainingContent;
   A buffer with the remaining content of the file.

Then propose it to the HTML5 spec.
> Can we check how Google Chrome has implemented this.

Done immediately: chrome does sync I/O.

> Another observation, In Firefox when we do FORM submit (POST) the web server
> receive the latest version of the file.

Sure

> Then propose it to the HTML5 spec.

This is interesting.

So, what I'm doing right now is this:

1. File.size does sync I/O if the File object points to a real file on the local filesystem.
This can happen only if the file has been chosen from a FilePicker.
This is going to be the first step and with this we will fix this issue.

2. Follow up: for windows we have a working nsIFileWatcherService.
I'm planning to use it (I have a 80% written patch for this). This means that for Windows we don't do sync IO all the time but we cache the size value after the first read. Then we will receive notifications from the FileWatcher service when something changes.

3. Follow up (super low priority - I guess): I want to implement a nsIFileWatcherServicer for linux too.
GTK has a GFileWatcher thing and we already have GTK as dependence for Linux and *BSD. This should work at the same way of point 2.

4. If we can cover mac too, somehow with FSEvent or KQueue, we can remove the sync I/O read completely, I hope.
A dummy performance test for .size handling is http://mozilla.pettay.fi/moztests/file.size.html
Blink is really bad with that, and I assume we'll regress that quite a bit with (1), which is acceptable for now and (2-4) will bring back the good performance.
It feels a bit bad if we have just (2) but not (3-4), and it is a tad error prone to have so different code paths for different OSes. But sure, (2) makes performance better for most of the users.
Attached patch patch - RequestSyncIO (obsolete) — Splinter Review
Assignee: nobody → amarchesini
Attachment #8738988 - Flags: review?(khuey)
Khuey, with this patch, I propose a new 'RequestSyncIO()' method for the BlobImpl class.
The basic implementations of BlobImpl return false; BlobImplFile returns true and MultipleBlobImpl returns true if one of the sub blobImpl returns true.

This is useful for e10s and RemoteBlob because they do sync GetSize and GetLastModified when needed.
Comment on attachment 8738988 [details] [diff] [review]
patch - RequestSyncIO

No review request yet. Some tests are broken.
Attachment #8738988 - Flags: review?(khuey)
(In reply to tgvaughan from comment #5)
> The site compatibility doc claims that the issue arises when a previous
> FileReader instance is reused when reloading the File, but I see when a new
> FileReader instance is used. Maybe you're referring to the File instance
> (which is reused)?

Thanks. Fixed the confusing description.
Attached patch aaa.patchSplinter Review
This patch is built on top of nsIFileWatcherService. So far it works fine on windows and on any Gtk-platform builds (with bug 1263398).

I'm also planning to write a nsIFileWatcher that does polling and file bugs for MacOSX and Android.
Attachment #8738988 - Attachment is obsolete: true
Attachment #8740082 - Flags: review?(khuey)
This would be landing very late in 46 beta. I would rather not do that unless we absolutely have to.
Should this block the release?  If we don't fix it in 46 beta, would this drive a dot release?
Flags: needinfo?(khuey)
Flags: needinfo?(amarchesini)
I have no idea, but I'm skeptical we need this on branches.  Andrea?
Flags: needinfo?(khuey)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #14)
> This would be landing very late in 46 beta. I would rather not do that
> unless we absolutely have to.
> Should this block the release?  If we don't fix it in 46 beta, would this
> drive a dot release?

Not at all. Actually Jonas things that the current behavior is correct. This is definitely not a regression.
The spec says that the File obj is immutable. This means that if the underlying file changes, the DOM File object should not be usable in any API.

I still think that the spec should be changed and this patch is important to have. But this is not related with FF 46 and so on.
Flags: needinfo?(amarchesini)
OK. Thanks for the feedback! I think we should keep this change with 48 then.
 
Comment 1 gives the regression window as 45. Not sure whether to mark this as unaffected or simply not track it.   Comment 4 also says, "45 ESR and 46+ require the fix" which is why I was asking.
Jonas, are you OK to resolve this bug as INVALID?
Flags: needinfo?(jonas)
Note that the important IO here is not the filesize. The important IO is the file contents.

It's really important that a call to FileReader.readAsText(HTML_FORM_INPUT.files[0]) doesn't produce data that contains partially the old data and partially the new data.

Likewise, the following should not result in partially old and partially new data being read, and it should not result in any gaps in the data:

var myFile = HTML_FORM_INPUT.files[0];
var chunkSize = 1024*1024;
loadChunk(0);
function loadChunk(n) {
  var nChunks = Math.ceil(myFile.size / chunkSize);
  if (n >= nChunks) {
    done();
    return;
  }
  displayProgressBar(n/nChunks);
  var fr = new FileReader();
  fr.readAsText(myFile.slice(chunkSize * n, chunkSize * (n+1)));
  fr.onload = function() {
    process(fr.result);
    loadChunk(n+1);
  }
}


Neither should


var myFile = HTML_FORM_INPUT.files[0];
var chunkSize = 1024*1024;
var nChunks = Math.ceil(myFile.size / chunkSize);
var slices = [];
for (i = 0; i < nChunks; i++) {
  slices.push(myFile.slice(chunkSize * i, chunkSize * (i+1)));
}
loadChunk(0);
function loadChunk(n) {
  if (n >= slices.length) {
    done();
    return;
  }
  displayProgressBar(n/slices.length);
  var fr = new FileReader();
  fr.readAsText(slices[n]);
  fr.onload = function() {
    process(fr.result);
    loadChunk(n+1);
  }
}


It's much preferable that these examples result in a read error, than that they result in corrupted data.


I don't see how that's achievable if we update File.size.


Ideally I think we should use nsIFileWatcherService to detect if the OS-file changed since the File object was created. And any time the file changes, fail any attempts to read from the File or any slices that has been created from it. That's more dependable than relying on the filesize changing.
Flags: needinfo?(jonas)
> Ideally I think we should use nsIFileWatcherService to detect if the OS-file
> changed since the File object was created. And any time the file changes,
> fail any attempts to read from the File or any slices that has been created
> from it. That's more dependable than relying on the filesize changing.

Indeed. Before asking you a NI I actually filed bug 1264371.
But the point of this bug is to support the second FileReader and let it read the content of the file after the file is changed.
So I guess we should mark this bug as invalid and file a new one to throw more exceptions than what we currently do :)

Definitely we should finish the work I've started, implementing nsIFileWatcherService for all the supported platform (currently we have just windows). I wrote a patch to support it using GIO if we compile FF with GTK (bug 1263398) and I have an half-written patch that does polling for other platforms. For Mac we need FSEvents and KQueue.
Yeah, i think bug 1264371 covers the remaining changes that we want here.

IIRC File.size already grabs a snapshot of the filesize when the File object is created, and then never changes no matter what. Which is exactly the behavior that we want.

So marking this as INVALID.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID
Updated the site compat doc accordingly: https://github.com/fxsitecompat/www.fxsitecompat.com/commit/9996fbd

I'll keep the dev-doc-needed keyword in case MDN also requires a note on it.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #23)
> So .... can I cancel this review?

Yes. I have to change the test. I'll propose a different version of the same patch (to do exactly the opposite) in a separate bug.
Attachment #8740082 - Flags: review?(khuey)
(In reply to Kohei Yoshino [:kohei] from comment #4)
> Posted the site compatibility doc for reference:
> https://www.fxsitecompat.com/en-CA/docs/2016/filereader-readastext-fails-when-file-is-updated-after-initial-reading/
say...
"... To work around the issue, you should always use a new FileReader instance to read a file."

but new FileReader instance still gives error
This is actually wrong. Onces the underlying file is changed, any API should not be able to use it.
(In reply to Andrea Marchesini (:baku) from comment #26)
> This is actually wrong. Onces the underlying file is changed, any API should
> not be able to use it.

It is common we attach a file then user realize s/he have not saved the final changes to the document.
Do we have a work around to solve such case?

Also I dont see in the file reader error handler there is no way to identify what caused the error.
Is there a way to identify the cause?
I think we could return a different File object from HTMLInputElement.files[0] if the file changed on disk.

But we currently don't, so there's no workaround that I can think of.

Likewise, I'd be fine with us firing a special error from FileReader if the file changed on disk. But I don't think we currently do.
(In reply to Jonas Sicking (:sicking) from comment #28)
Do you want me to file RFE bugs?
created Bug 1265602 and Bug 1265603
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: