Closed Bug 1408397 Opened 2 years ago Closed 2 years ago

Unexpected file reader (blobs obtained with .slice()) read errors ("File could not be read" - NS_ERROR_DOM_FILE_NOT_READABLE_ERR)

Categories

(Core :: DOM: File, defect, major)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox-esr52 --- unaffected
firefox56 --- wontfix
firefox57 + fixed
firefox58 + fixed

People

(Reporter: mayhemer, Assigned: baku)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(1 file)

Windows 10, up to date Nightly (x64), clean profile

STR:
- download and unpack https://janbambas.cz/dwnlds/js-perf-test-logs.tar.xz
- go to https://janbambas.cz/moz/logan/
- drop the single "mozilla-FX2o.log" on the [ Browse... ] button (or use any other way to load it)

actual:
"File could not be read" coming from FileReader.onerror at [1]

expected:
file slices are all read (the load progress finishes with 100%) and the log can be searched


I can reproduce this locally and also have external reports in bug 1391710 about this.

I can find the regression range, if needed, since this used to work quite recently.


[1] https://github.com/mayhemer/logan/blob/10e9f97a0e4339550bc892854b4e7ea4d69794f3/logan.js#L862
And a note that the read error most often happens with the second slice but sometimes with any of the later processed one.  It's somewhat intermittent, but error on some of the processed slices always happens for me sooner or later.
Could not repro in a debug build so far...
I've seen this error code spit out recently in some intermittent test timeouts.  Is it possible httpd.js uses this feature and his hitting the same problem?
Looks like the child process hits the error here:

 	xul.dll!mozilla::dom::FileReader::FreeDataAndDispatchError() Line 574	C++
>	xul.dll!mozilla::dom::FileReader::OnLoadEnd(NS_OK) Line 717	C++
 	xul.dll!mozilla::dom::FileReader::OnInputStreamReady(0x0000021683d52550) Line 672	C++
 	xul.dll!mozilla::dom::`anonymous namespace'::InputStreamCallbackRunnable::Run() Line 51	C++
 	xul.dll!mozilla::SchedulerGroup::Runnable::Run() Line 396	C++
 	xul.dll!nsThread::ProcessNextEvent(, 0x000000c2033ff3e0) Line 1038	C++
 	xul.dll!NS_ProcessNextEvent(, false) Line 524	C++

https://dxr.mozilla.org/mozilla-central/rev/bc7a5be76b723cf6aac1a919156e74997c5f4902/dom/file/FileReader.cpp#715

Where mDataLen = 5238784, mTotal = 5242880, difference is exactly 4096 bytes.  Note that the file on disk has NOT been touched at all.

From some reason the IPCBlobInputStream's mAsyncRemoteStream either doesn't advertise all data via Available() or ReadSegments doesn't read all the data and one of them reports NS_BASE_STREAM_CLOSED too soon.  In an optimized build it's hard to find out which one it is.

Quickly checking how the chunks are coming in, apparently aStream->Available(&count); in FileReader::OnInputStreamReady returns various result from 16k to 56k and 64k (mults of 4k).  

This is definitely a race condition somewhere inside or around IPCBlobInputStream.
Flags: needinfo?(amarchesini)
> I can find the regression range, if needed, since this used to work quite
> recently.

Yes, this will be useful. I cannot reproduce it locally (linux) and I don't have a windows10 machine at the moment.

Note that mTotal is the blob size. mDataLen is what has been read from the inputStream. Nothing related with ::Available().
Flags: needinfo?(amarchesini) → needinfo?(honzab.moz)
I can reproduce it. No needs for the range.
Flags: needinfo?(honzab.moz)
Attached patch fix.patchSplinter Review
The bug here is that we create a bufferInputStream but we don't use it as async InputStream. This can trigger the case where the buffer has some data, but the async inputStream is completed.

We don't really need the buffer input Stream here. Let's remove it.
Assignee: nobody → amarchesini
Attachment #8918849 - Flags: review?(bugs)
Which bug regressed this? We need that in order to know which branches need the fix.
Bug 1353629 part 8.
[Tracking Requested - why for this release]:
We don't want data loss issues in FF57.
Looks like that landed in FF55.
(In reply to Andrea Marchesini [:baku] from comment #7)
> We don't really need the buffer input Stream here. 
Why? The comment there says it is needed.
(In reply to Olli Pettay [:smaug] (work week Oct 16-20) from comment #12)
> (In reply to Andrea Marchesini [:baku] from comment #7)
> > We don't really need the buffer input Stream here. 
> Why? The comment there says it is needed.

We are using the buffered InputStream just to call ReadSegments(). But we can easily implement the same logic using ::Read().
ReadSegments() is more efficient than Read(), though.  You can detect if ReadSegments() is supported dynamically with NS_InputStreamIsBuffered().  This would let you consume the stream with ReadSegments if supported, and fall back to the slower Read() if not.
> ReadSegments() is more efficient than Read(), though.  You can detect if

I don't see why. The idea behind ReadSegments is that you don't increment the reading point if the writing operation (the callback) returns an error. Here the writer callback doesn't fail and we don't need the feature of reading the same buffer again.

Currently, the inputStream of a Blob can be:
. a File InputStream - no ReadSegments (non-e10s only)
. a IPCBlobInputStream - here it's often a pipe and Read() of a pipe calls ReadSegments() internally.
. a MultiplexInputStream - it calls Read/ReadSegmnet() of the sub streams
. a String InputStream - Read() uses ReadSegments()
Comment on attachment 8918849 [details] [diff] [review]
fix.patch

+    char16_t *dest = nullptr;
Nit, * goes with the type
Attachment #8918849 - Flags: review?(bugs) → review+
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/61118e8a62af
FileReader should not use a bufferStream together with an async inputStream, r=smaug
(In reply to Andrea Marchesini [:baku] from comment #15)
> > ReadSegments() is more efficient than Read(), though.  You can detect if
> 
> I don't see why. The idea behind ReadSegments is that you don't increment
> the reading point if the writing operation (the callback) returns an error.
> Here the writer callback doesn't fail and we don't need the feature of
> reading the same buffer again.

No.  The point of ReadSegments() is to avoid this:

  char tmpBuffer[4096];

You are introducing an extra copy here that is unnecessary with many streams!
Please file a follow-up to use ReadSegments() here when available to avoid the extra copy to tmpBuffer.  Or can you explain why ReadSegments() here would perform the same number of copies?
Flags: needinfo?(amarchesini)
That is very true. I asked that on IRC but didn't then think of it more. Yes, we need to convert from char to char16_t, and the patch does make extra copy without need.
Blocks: 1409394
Flags: needinfo?(amarchesini)
Blocks: 1391710
https://hg.mozilla.org/mozilla-central/rev/61118e8a62af
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Please request Beta approval on this when you get a chance.
Flags: needinfo?(amarchesini)
Comment on attachment 8918849 [details] [diff] [review]
fix.patch

Approval Request Comment
[Feature/Bug causing the regression]: IPCBlob refactoring
[User impact if declined]: data lost when reading a binary string from a FileReader
[Is this code covered by automated tests?]: it fixes some intermittent failure. So yes.
[Has the fix been verified in Nightly?]: yes.
[Needs manual test from QE? If yes, steps to reproduce]: See the description of the bug.
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: low
[Why is the change risky/not risky?]: We don't use a bufferedInputStream in FileReader but, instead, we do a read(). Nothing special.
[String changes made/needed]: none
Flags: needinfo?(amarchesini)
Attachment #8918849 - Flags: approval-mozilla-beta?
Comment on attachment 8918849 [details] [diff] [review]
fix.patch

Fixes a recent (since 55) data loss issue, Beta57+
Attachment #8918849 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.