Failure to handle short read: the case of reading a JSON file into JavaScript 'String' type. (failure to deal with short read)
Categories
(Core :: XPCOM, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox105 | --- | fixed |
People
(Reporter: ishikawa, Assigned: nika)
References
(Blocks 2 open bugs)
Details
Attachments
(2 files)
3.01 KB,
patch
|
Details | Diff | Splinter Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
Reporter | ||
Updated•9 years ago
|
Reporter | ||
Updated•9 years ago
|
Reporter | ||
Comment 1•9 years ago
|
||
Updated•6 years ago
|
Comment 2•5 years ago
|
||
The meta keyword is there, the bug doesn't depend on other bugs and there is no activity for 12 months.
:froydnj, maybe it's time to close this bug?
Comment 3•5 years ago
|
||
AFAIK, this bug still applies. Ishikawa-san, is that correct?
Updated•5 years ago
|
Reporter | ||
Comment 4•5 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #3)
AFAIK, this bug still applies. Ishikawa-san, is that correct?
Yes, I believe so.
My plan is
-
- to land the POP3 fix to reduce the # of system calls (write/seek done for each line today).
E.G. A photo attachment (in MIME format) causes easily 40K system calls... Kills the
performance of a File server if the profile is on a remote file system.
I have seen more than x10 write speed gain under local test. I mean copying about 1000 messages took more than two minutes while with my patch finished under 5 seconds in a test.
- to land the POP3 fix to reduce the # of system calls (write/seek done for each line today).
-
- to land the series of short-read fixes mentioned here and elsewhere.
Trying to finish the fiscal-year-end hassle at the office, AND cope with the corona virus outbreak in Japan.
Comment hidden (off-topic) |
Assignee | ||
Comment 6•2 years ago
|
||
The NS_FillArray
function definitely only reads a single time from the underlying input stream, so can read less than the full capacity of aDest
without reaching EOF. That being said, the signature of the function does describe its behavior that way, so it doesn't look like it's necessarily wrong unless the callers aren't handling it correctly.
Looking at the tree right now, the only caller appears to be nsConverterInputStream
, which calls it as part of the Fill
method (https://searchfox.org/mozilla-central/rev/7b9d23ece4835bf355e5195f30fef942d376a1c7/intl/uconv/nsConverterInputStream.cpp#189). This method does appear to have a slight bug when reading unicode data with short reads, though I don't know for sure if it's specifically the cause of the issue mentioned in comment 0.
Specifically, the call will read at least 1 byte from the source stream, but 1 byte is often not enough data to decode a full utf-16 character for whichever encoding is used as the source encoding. This could mean that the value of written
returned from the decoder (https://searchfox.org/mozilla-central/rev/7b9d23ece4835bf355e5195f30fef942d376a1c7/intl/uconv/nsConverterInputStream.cpp#212-221) is zero, which would lead to a NS_OK
code with 0 bytes written being returned despite the number of bytes filled being 0, leading to the stream being terminated early.
I think the code in nsConverterInputStream
should instead call NS_FillArray
and try to decode in a loop until either at least 1 utf-16 character has been decoded from the source stream, or the NS_FillArray
call returns an error.
Assignee | ||
Comment 7•2 years ago
|
||
This patch changes how nsConverterInputStream handles passing data
through to the underlying unicode converter in order to make it more
reliably handle propagating errors and deal with short reads from the
underlying input stream.
This was done by making the code continuously read within the Fill
method until at least one character has been decoded from the input
stream, so that we don't spuriously communicate an EOF to the caller due
to a short read not producing enough bytes for the decoder to produce a
UTF-16 character.
In addition, while making this change it became easier to signal to
the decoder about the final read from the input stream, meaning that
partial characters at the end of the stream will now generate a
replacement character, rather than being ignored.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 8•2 years ago
|
||
The following patch is waiting for review from an inactive reviewer:
ID | Title | Author | Reviewer Status |
---|---|---|---|
D152682 | Bug 1170668 - Improve short read handling in nsConverterInputStream, r=hsivonen | nika | hsivonen: Back Aug 7, 2022 |
:nika, could you please find another reviewer?
For more information, please visit auto_nag documentation.
Reporter | ||
Comment 9•2 years ago
|
||
I am glad to see this finally handled by someone in the know.
Thank you!
Assignee | ||
Comment 10•2 years ago
|
||
We're about a week out from :hsivonen being back, and it'd be nice if they could look over the use of the encoding API to make sure I didn't screw anything up, so leaving the reviewer as-is for now.
Comment 11•2 years ago
|
||
Comment 12•2 years ago
•
|
||
Backed out for causing build bustages on TestShortRead.cpp
Comment 13•2 years ago
|
||
Assignee | ||
Updated•2 years ago
|
Comment 14•2 years ago
|
||
bugherder |
Description
•