Failure to handle short read: the case of reading a JSON file into JavaScript 'String' type. ( [META] Failure to deal with short read)

RESOLVED INACTIVE

Status

()

Core
JavaScript: Standard Library
--
major
RESOLVED INACTIVE
3 years ago
2 days ago

People

(Reporter: ISHIKAWA, Chiaki, Assigned: ISHIKAWA, Chiaki)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

3 years ago
Created attachment 8614175 [details] [diff] [review]
Failed attemp to fix the issue. Somehow if I repeat the read operations I get EINTR???

This blocks bug 1170564.
This bugzilla about 

> 4. Case of reading JSON file from JavaScript.

FYI, see bug 1170564 quoted below for details about
"short read".

During the testing of C-C TB |make mozmill| test suite
with simulated short read
I noticed an error caused by the failure to handle
short read against a file named
_tests/mozmill/mozmillprofile/session.json

below my MOZOBJ /REF-OBJ-DIR/objdir-tb3/

What happened was that a simulated short read occurred.
Only the initial portion of the JSON file was returned to the C-C
TB, thus the program misbehaved since it did not have all the info in JSON file. Thus I noticed a few errors starting with
TEST-UNEXPECTED-FAIL | /REF-COMM-CENTRAL/comm-central/mail/test/mozmill/session-store/test-session-store.js | test-session-store.js::test_restore_single_3pane_persistence

NOTE; BEWARE FF developers.
Since this has something to do with JSON reading from JavaScript (found by analyzing the stack trace), this can affect FF as well, I think.

Explanation.

JSON files

One of the short read issue manifests itself when JavaScript
interpreter reads a JSON file into an array of known size (I
think it is a String). It is case (a)-1 explained in the
original META bugzilla entry.

If a |read| fails to read all the available data, the JS
interpreter doesn't care and proceed and so eventually we have
an error during |make mozmill| test because not all the
relevant data in JSON file is read!

This operation involves a JavaScript interpreter operation and
CPP I/O operation and I am not sure how to fix it: while I
tried to fix it naively, I get EINTR error(!) (See the
discussion in [1].  So There may be a case that EINTR needed
to be deal with for real after all!)

With my naive fix, the program got into an infinite loop :-(
This is hard to fix and so I am submitting this as a separate
entry to ask for wider attention.

[1] Bug 1126881 - Mapping of errno to NS_* macros
(nsresultForErrno(int aErr)) is not as complete as mapping to
PR_* macros


++ This bug was initially created as a clone of Bug #1170564 +++

After spending some time to weed out the failure to detect I/O
errors from C-C TB code during message download [1], I have
come to realize there are issues regarding short read.

Short read refers to |read| system call returning without the
full number of requested octets although the subsequent |read|
calls would return the missing octets. That is, short read
returns prematurely.

Short read can happen during a read operation against a
remotely mounted file. High-load on the remote server, or a
transient network issue can cause such short read.

It is not a stranger to FF developers. Network code is full of
cautious programming to take care of such short read.

Unfortunately, C-C TB (under linux at least) fails to handle
the short read very well and as a result we see errors which I
noticed during tests.

So I set out to identify and fix these issues and this is a
meta entry to track such errors.

It turns out that performing additional writes after a
short-write is automagically handled by |pt_Write|. Isn't it
great?

Then why not pt_Read? As I explain later in this post, we
sometimes don't know how many we should read in
advance. That's why.

Short read simulated:

I have come to learn that mounting one's profile to remote
file server seems to cause strange errors.  Since I don't have
a CIFS server that seems to cause such short read very often
under heavy I/O work load, I resort to simulation.

I inject short read by preparing some functions such as |read|
and |read64| that simulates short read and preload them using

"export LD_PRELOAD=that-file.so" 

before running C-C thunderbird to find errors caused by short
read.


How to FIX after identifying the trouble spot:

When you deal with short read issues there are a few things
that we must keep in mind.

At the level of |read| system call, we have no idea whether a
short read ought to be followed by additional |read| or not.

Then where should we make the decision?  It depends on the
type of |read| operation.

There are two types of |read| operations.

(a) # of octets to read is known in advance.

    The caller of |read| knows exactly that there are certain
    number of octets to be read in advance.

    E.g. 

     (1) we check the size of JSON file by calling |stat| in
         advance and try to read that many octets in one
         |read|.  [As it turns out this is a real-world
         problem.]
    
     (2) we try to read a "block" of a known data structure
         and sizeof(block) is already known in advance (at
         compile time, etc).
         A case in point is the "block" for CACHE operation.
	 (There was an issue and it was fixed.)

In these cases, we ought to perform additional |read|
operations until all the expected number of octets are read
(or until error condition or EOF is encountered.) in the face
of short read. So we must provide a mechanism for such
additional reads.

In my patch, I have introduced a couple of variants of |Read|
operation and |PR_Read| that performs additional read
operations if we failed to read the desired # of octets in one
run until all the octets are read.


(b)  # of octets to read is not known in advance.

     E.g.,

     (1) There is a mime-encoded data in a file (as in
     attachment), and the program needs to read as many data
     as possible until the end of the data marker line is
     reached.

     (2) In general, suppose there is a data structure written
     in ASCII representation in a file, and also suppose the
     caller of |read| can know the end of the logical data
     structure by reading a certain textual construct that
     signals the end of the data. In this case, the caller of
     |read| needs to perform the |read| repeatedly until this
     end of data marker is read.
     Case (1) is obviously a particular case of (2).

     (3) There are similar variants.

In these cases, we can NOT rely on the number of octets to
read in advance. The size of the buffer passed to |read| is
just an estimate, a ball park figure. We may have more data to
read or much less data to read actually.  We ought to perform
additional |read| until the end of data marker is reached (or
the error is signaled or EOF is reached.)

Between (a) and (b), the main difference is case (a) knows the
fixed number of octets to be read in advance, and case (b)
does not know it.

In practice, |read| calls of case (b) tend to happen inside
loops and are relatively easy to spot, and no modification is
necessary most of the time even if short read has to be
considered.  (Well there are cases where the ERROR condition
was not detected at all in the current code and I tried to fix
them.)

The short read in case (a) needs to be handled at the spot
where the required number of octets is known in the call
chain.  At the |read| system call level, we have no idea
whether a particular short |read| ought to be followed by
additional reads at all.

To figure out where such problematic short read is
encountered, and to learn whether the |read| is type (a) or
type (b) above, I had to dump stack trace to figure out where
such short read is NOT followed by necessary additional read
operations. Using the stack trace, I tried to figure out where
the decision to perform additional reads can be made and, if
necessary, replaced the read with a version that performs such
additional read operations.

I think I explained enough background.

There will be four bugzilla entries I intend to submit today and
block this meta entriy.

1. NSS "dbm" code used for certificate and key handling.
   (Already reported in newsgroup/mailing list.)

2. Mostly C-C T-B fixes to take care of short read.

3. A few M-C fixes.

   Yes, there *ARE* places where short read is not handled in
   M-C!  I was surprised myself.

   Maybe storing one's profile under remote file server was an
   arcane idea when mozilla code base was written years ago.
   But today, in corporate setup, PC is often used like a thin
   client and everything including your profile may be stored
   on a central server.

4. Case of reading JSON file from JavaScript.

There is a known bug which I don't know how to fix.  It is
filed as a separate bugzilla. It is concerned with a reading
of JSON file from JavaScript.

There are a few more unresolved cases that manifest themselves
during local |make mozmill| test suite operation with
simulated short read injection, but I am not sure if they are real and not timing-dependent random errors that are seen before. [Yes,
there are such random errors in |make mozmill|.]

I will add them to known C-C T-B fixes and M-C fixes *iff*
they turn out to be real bugs in the future.

TIA

[1] Bug 1121842 - [META] RFC: C-C Thunderbird - Cleaning of
incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp
and friends.
(Assignee)

Updated

3 years ago
No longer depends on: 1170564, 1170578
(Assignee)

Updated

3 years ago
Assignee: nobody → ishikawa
(Assignee)

Updated

3 years ago
No longer blocks: 1170606
(Assignee)

Updated

3 years ago
No longer blocks: 1170646
(Assignee)

Updated

3 years ago
Blocks: 1170564
(Assignee)

Updated

3 years ago
See Also: → bug 1170578
(Assignee)

Updated

3 years ago
See Also: → bug 1170606
(Assignee)

Updated

3 years ago
See Also: → bug 1170646
(Assignee)

Comment 1

3 years ago
If I put #if 1 instead of #if 0 in the uploaded patch,
I get EINTR errors repeatedly.

The |busy_beaver_Stream_Read| in the patch is indeed a very quick hack and there may
be an issue when it needs to work with NS_FillArray semantics?

Anyway, this is the first time I see EINTR happening for real.

TIA

Comment 2

2 days ago
Per policy at https://wiki.mozilla.org/Bug_Triage/Projects/Bug_Handling/Bug_Husbandry#Inactive_Bugs. If this bug is not an enhancement request or a bug not present in a supported release of Firefox, then it may be reopened.
Status: NEW → RESOLVED
Last Resolved: 2 days ago
Resolution: --- → INACTIVE
You need to log in before you can comment on or make changes to this bug.