1408397 - Unexpected file reader (blobs obtained with .slice()) read errors ("File could not be read" - NS_ERROR_DOM_FILE_NOT_READABLE_ERR)

Reporter

Description

•

7 years ago

Windows 10, up to date Nightly (x64), clean profile

STR:
- download and unpack https://janbambas.cz/dwnlds/js-perf-test-logs.tar.xz
- go to https://janbambas.cz/moz/logan/
- drop the single "mozilla-FX2o.log" on the [ Browse... ] button (or use any other way to load it)

actual:
"File could not be read" coming from FileReader.onerror at [1]

expected:
file slices are all read (the load progress finishes with 100%) and the log can be searched


I can reproduce this locally and also have external reports in bug 1391710 about this.

I can find the regression range, if needed, since this used to work quite recently.


[1] https://github.com/mayhemer/logan/blob/10e9f97a0e4339550bc892854b4e7ea4d69794f3/logan.js#L862

Honza Bambas (:mayhemer)

Reporter

Comment 1

•

7 years ago

And a note that the read error most often happens with the second slice but sometimes with any of the later processed one.  It's somewhat intermittent, but error on some of the processed slices always happens for me sooner or later.

Honza Bambas (:mayhemer)

Reporter

Comment 2

•

7 years ago

Could not repro in a debug build so far...

Ben Kelly [:bkelly, not reviewing]

Comment 3

•

7 years ago

I've seen this error code spit out recently in some intermittent test timeouts.  Is it possible httpd.js uses this feature and his hitting the same problem?

Honza Bambas (:mayhemer)

Reporter

Comment 4

•

7 years ago

Looks like the child process hits the error here:

 	xul.dll!mozilla::dom::FileReader::FreeDataAndDispatchError() Line 574	C++
>	xul.dll!mozilla::dom::FileReader::OnLoadEnd(NS_OK) Line 717	C++
 	xul.dll!mozilla::dom::FileReader::OnInputStreamReady(0x0000021683d52550) Line 672	C++
 	xul.dll!mozilla::dom::`anonymous namespace'::InputStreamCallbackRunnable::Run() Line 51	C++
 	xul.dll!mozilla::SchedulerGroup::Runnable::Run() Line 396	C++
 	xul.dll!nsThread::ProcessNextEvent(, 0x000000c2033ff3e0) Line 1038	C++
 	xul.dll!NS_ProcessNextEvent(, false) Line 524	C++

https://dxr.mozilla.org/mozilla-central/rev/bc7a5be76b723cf6aac1a919156e74997c5f4902/dom/file/FileReader.cpp#715

Where mDataLen = 5238784, mTotal = 5242880, difference is exactly 4096 bytes.  Note that the file on disk has NOT been touched at all.

From some reason the IPCBlobInputStream's mAsyncRemoteStream either doesn't advertise all data via Available() or ReadSegments doesn't read all the data and one of them reports NS_BASE_STREAM_CLOSED too soon.  In an optimized build it's hard to find out which one it is.

Quickly checking how the chunks are coming in, apparently aStream->Available(&count); in FileReader::OnInputStreamReady returns various result from 16k to 56k and 64k (mults of 4k).  

This is definitely a race condition somewhere inside or around IPCBlobInputStream.

Ben Kelly [:bkelly, not reviewing]

Updated

•

7 years ago

Flags: needinfo?(amarchesini)

Andrea Marchesini [:baku]

Assignee

Comment 5

•

7 years ago

> I can find the regression range, if needed, since this used to work quite
> recently.

Yes, this will be useful. I cannot reproduce it locally (linux) and I don't have a windows10 machine at the moment.

Note that mTotal is the blob size. mDataLen is what has been read from the inputStream. Nothing related with ::Available().

Flags: needinfo?(amarchesini) → needinfo?(honzab.moz)

Andrea Marchesini [:baku]

Assignee

Comment 6

•

7 years ago

I can reproduce it. No needs for the range.

Flags: needinfo?(honzab.moz)

Andrea Marchesini [:baku]

Assignee

Comment 7

•

7 years ago

Attached patch fix.patch — Details — Splinter Review

The bug here is that we create a bufferInputStream but we don't use it as async InputStream. This can trigger the case where the buffer has some data, but the async inputStream is completed.

We don't really need the buffer input Stream here. Let's remove it.

Assignee: nobody → amarchesini

Attachment #8918849 - Flags: review?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 8

•

7 years ago

Which bug regressed this? We need that in order to know which branches need the fix.

Andrea Marchesini [:baku]

Assignee

Comment 9

•

7 years ago

Bug 1353629 part 8.

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

7 years ago

Blocks: 1353629

Keywords: regressionwindow-wanted

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 10

•

7 years ago

[Tracking Requested - why for this release]:
We don't want data loss issues in FF57.

status-firefox57: --- → affected

tracking-firefox57: --- → ?

tracking-firefox58: --- → ?

Ben Kelly [:bkelly, not reviewing]

Comment 11

•

7 years ago

Looks like that landed in FF55.

status-firefox56: --- → wontfix

status-firefox-esr52: --- → unaffected

Julien Cristau [:jcristau] (back April 22)

Updated

•

7 years ago

tracking-firefox57: ? → +

tracking-firefox58: ? → +

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 12

•

7 years ago

(In reply to Andrea Marchesini [:baku] from comment #7)
> We don't really need the buffer input Stream here. 
Why? The comment there says it is needed.

Andrea Marchesini [:baku]

Assignee

Comment 13

•

7 years ago

(In reply to Olli Pettay [:smaug] (work week Oct 16-20) from comment #12)
> (In reply to Andrea Marchesini [:baku] from comment #7)
> > We don't really need the buffer input Stream here. 
> Why? The comment there says it is needed.

We are using the buffered InputStream just to call ReadSegments(). But we can easily implement the same logic using ::Read().

Ben Kelly [:bkelly, not reviewing]

Comment 14

•

7 years ago

ReadSegments() is more efficient than Read(), though.  You can detect if ReadSegments() is supported dynamically with NS_InputStreamIsBuffered().  This would let you consume the stream with ReadSegments if supported, and fall back to the slower Read() if not.

Andrea Marchesini [:baku]

Assignee

Comment 15

•

7 years ago

> ReadSegments() is more efficient than Read(), though.  You can detect if

I don't see why. The idea behind ReadSegments is that you don't increment the reading point if the writing operation (the callback) returns an error. Here the writer callback doesn't fail and we don't need the feature of reading the same buffer again.

Currently, the inputStream of a Blob can be:
. a File InputStream - no ReadSegments (non-e10s only)
. a IPCBlobInputStream - here it's often a pipe and Read() of a pipe calls ReadSegments() internally.
. a MultiplexInputStream - it calls Read/ReadSegmnet() of the sub streams
. a String InputStream - Read() uses ReadSegments()

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 16

•

7 years ago

Comment on attachment 8918849 [details] [diff] [review]
fix.patch

+    char16_t *dest = nullptr;
Nit, * goes with the type

Attachment #8918849 - Flags: review?(bugs) → review+

Pulsebot

Comment 17

•

7 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/61118e8a62af
FileReader should not use a bufferStream together with an async inputStream, r=smaug

Ben Kelly [:bkelly, not reviewing]

Comment 18

•

7 years ago

(In reply to Andrea Marchesini [:baku] from comment #15)
> > ReadSegments() is more efficient than Read(), though.  You can detect if
> 
> I don't see why. The idea behind ReadSegments is that you don't increment
> the reading point if the writing operation (the callback) returns an error.
> Here the writer callback doesn't fail and we don't need the feature of
> reading the same buffer again.

No.  The point of ReadSegments() is to avoid this:

  char tmpBuffer[4096];

You are introducing an extra copy here that is unnecessary with many streams!

Ben Kelly [:bkelly, not reviewing]

Comment 19

•

7 years ago

Please file a follow-up to use ReadSegments() here when available to avoid the extra copy to tmpBuffer.  Or can you explain why ReadSegments() here would perform the same number of copies?

Flags: needinfo?(amarchesini)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 20

•

7 years ago

That is very true. I asked that on IRC but didn't then think of it more. Yes, we need to convert from char to char16_t, and the patch does make extra copy without need.

Andrea Marchesini [:baku]

Assignee

Updated

•

7 years ago

Blocks: 1409394

Andrea Marchesini [:baku]

Assignee

Updated

•

7 years ago

Flags: needinfo?(amarchesini)

Honza Bambas (:mayhemer)

Reporter

Updated

•

7 years ago

Blocks: 1391710

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 21

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/61118e8a62af

Status: NEW → RESOLVED

Closed: 7 years ago

status-firefox58: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla58

Ryan VanderMeulen [:RyanVM]

Comment 22

•

7 years ago

Please request Beta approval on this when you get a chance.

Flags: needinfo?(amarchesini)

Andrea Marchesini [:baku]

Assignee

Comment 23

•

7 years ago

Comment on attachment 8918849 [details] [diff] [review]
fix.patch

Approval Request Comment
[Feature/Bug causing the regression]: IPCBlob refactoring
[User impact if declined]: data lost when reading a binary string from a FileReader
[Is this code covered by automated tests?]: it fixes some intermittent failure. So yes.
[Has the fix been verified in Nightly?]: yes.
[Needs manual test from QE? If yes, steps to reproduce]: See the description of the bug.
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: low
[Why is the change risky/not risky?]: We don't use a bufferedInputStream in FileReader but, instead, we do a read(). Nothing special.
[String changes made/needed]: none

Flags: needinfo?(amarchesini)

Attachment #8918849 - Flags: approval-mozilla-beta?

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 24

•

7 years ago

Comment on attachment 8918849 [details] [diff] [review]
fix.patch

Fixes a recent (since 55) data loss issue, Beta57+

Attachment #8918849 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Ryan VanderMeulen [:RyanVM]

Comment 25

•

7 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-beta/rev/7819bf87d650

status-firefox57: affected → fixed

Daniel Veditz [:dveditz]

Updated

•

1 year ago

Regressions: CVE-2023-32213