Open Bug 1126881 Opened 9 years ago Updated 2 years ago

Mapping of errno to NS_* macros (nsresultForErrno(int aErr)) is not as complete as mapping to PR_* macros

Categories

(Core :: XPCOM, defect)

All
Linux
defect

Tracking

()

People

(Reporter: ishikawa, Assigned: ishikawa, NeedInfo)

Details

Attachments

(1 file)

Mozilla software needs to map underlying OS errors to an
application-level error code.  As far as I know, there are NS_* error
macros and PR_* error macros.

We need to make sure that both class of macros are mapped so that
routines that use libraries that return NS_* error values and routines
that libraries that return PR_* error values can learn what the
underlying OS or native library routines return.

Now, I have noticed that there is a serious discrepancy of the number
of different errors that are mapped by the following two functions to
be used under POSIX (UNIX-like, i.e., linux and apple macos x).

(1) Mapping errno -> NS_*   (incomplete as opposed to the case 2 below)

inline nsresult
nsresultForErrno(int aErr)
http://mxr.mozilla.org/comm-central/source/mozilla/xpcom/io/nsLocalFile.h#51

(2) mapping errno -> PR_*
http://mxr.mozilla.org/comm-central/source/mozilla/nsprpub/pr/src/md/unix/unix_errors.c

The case (2) is interesting.
There are different routines: one is to map the basic errors
like nsresultForErrno() above, but
there are routines that try to map errors under a particular situation
to more meaningful PR_* macro values.

For example, ETIMEDOUT is mapped to PR_REMOTE_FILE_ERROR when it is encountered
during close() operation, I think. This mapping makes much sense.

http://mxr.mozilla.org/comm-central/source/mozilla/nsprpub/pr/src/md/unix/unix_errors.c#451
451 void _MD_unix_map_close_error(int err)
452 {
453     PRErrorCode prError;
454 
455     switch (err) {
456         case ETIMEDOUT:
457             prError = PR_REMOTE_FILE_ERROR;
458             break;
459         default:
460             _MD_unix_map_default_error(err);
461             return;
462     }
463     PR_SetError(prError, err);
464 }
465

Even for the simplest default mapping |_MD_unix_map_default_error| [1],
which is essentially equivalent of the function, nsresultForErrno in
(1), the difference is huge.  While the function nsresultForErrno()
handles only about 15 errno values, the latter handles about 60, four
times as many (!).

We should make sure *more complete* proper error mapping is done from
errno to NS_* error macros.

This can be possibly done in steps.

step-1: Make nsresultForErrno() map as many errno values as
|_MD_unix_map_default_error|.

At this stage, it may be necessary to introduce a new NS_* error
macros. I have no idea.

step-2: We may want to introduce functions similar, but not exactly
the same to _MD_unix_map_close_error(int err), so that an errno value
can be mapped to proper NS_* macro with narrower meaning.  These can
be used by library routines to return exact error meaning rather than
generic NS_* error values.
Again, new NS_* macros may be necessary. I have no idea.

Background: 
I noticed the discrepany while I have been trying to fix the lack of
proper error checking in pop3 and imap download routines in
thunderbird. I was comparing some fixes I did to
CopyToNative about a year ago [2], and wondered why EHOSTDOWN which I
experienced during the failure of remote CIFS-mout is not
in NSRESULT_FOR_ERRNO() that uses nsresultForErrno(errno).
http://mxr.mozilla.org/comm-central/source/mozilla/xpcom/io/nsLocalFile.h#119

Then I realized that CopyToNative() uses functions that return d PR_* type of errors
while
pop3 and imap download deal with functions that return NS_* type of
errors. 

I now suspect that one of the reasons why there is wide lack of proper
error checking for I/O in thunderbird, especially message handling, is
that proper mapping OS error code was not available for library
routines that return NS_* error values on top of programmer laziness,
for lack of better words.

The failure to perform error checking in I/O operations is a serious
matter for Thunderbird (and I suspect it is for firefox, too, although
the amount of data WRITTEN is way over larger for TB, and so TB faces
more danger of ignoring write errors silently.)

I have no idea whether the mozilla software on top of firefox OS uses
NSPR libraries, but if so such software needs proper error mapping,
too!

This deficiency of error mapping function for NS_* macros should be fixed one way or the other.

TIA

[1] http://mxr.mozilla.org/comm-central/source/mozilla/nsprpub/pr/src/md/unix/unix_errors.c#12

[2] Bug 936987 - Catch and propagate PR_Close() error in CopyToNative() 

cf. 
Bug 931703 - Adding ENOSPC and a few other POSIX error macros to
nsresultForErrno()

At the time of bug 931703, I only requested ENOSPC.  In hindsight, I
should insist on mapping all the default POSIX errors and some more!
Assignee: nobody → ishikawa
Here is the more comprehensive mapping of errno values to NS_* macros.
I had to come up with a few new NS_ERROR* symbols to map the POSIX errno values.

Tested under linux.

Since I touched  nsLocalFile.h back in
Bug 931703 - Add unhandled/missing errno for NSRESULT_FOR_ERRNO macros. and I got a review from bsmedberg, I obtained review from bsmedberg here.

TIA
Attachment #8604452 - Flags: review?(benjamin)
(In reply to ISHIKAWA, Chiaki from comment #1)
> Created attachment 8604452 [details] [diff] [review]
> More comprehensive mapping of errno to NS_* error macros
> 
> Here is the more comprehensive mapping of errno values to NS_* macros.
> I had to come up with a few new NS_ERROR* symbols to map the POSIX errno
> values.
> 
> Tested under linux.
> 
> Since I touched  nsLocalFile.h back in
> Bug 931703 - Add unhandled/missing errno for NSRESULT_FOR_ERRNO macros. and
> I got a review from bsmedberg, I obtained review from bsmedberg here.
> 
> TIA

I forgot to mention.

Mapping of errno to PR_* error macros are more complete than the mapping to NS_*.

However, I noticed a few misguided mapping cases om PR_* mappings. That is, the mapping is clearly incorrect in the following cases.

           case EMLINK:
111             prError = PR_MAX_DIRECTORY_ENTRIES_ERROR;

           case ENOLCK:
145             prError = PR_FILE_IS_LOCKED_ERROR;

184 #ifdef EOVERFLOW
          case EOVERFLOW:
186             prError = PR_BUFFER_OVERFLOW_ERROR;
187             break;
188 #endif

            case ERANGE:
207             prError = PR_INVALID_METHOD_ERROR;
208             break;

There are other dubious cases (judgment calls), but I let them pass.

Had I known how to change the PR_* value mappings (say, introducing better error codes)
I would have done it.
But prerr.h states it is automatically generated. So I think there is
a mechanism to create this file.
But I don't know how.
http://mxr.mozilla.org/comm-central/source/mozilla/nsprpub/pr/include/prerr.h

 9 /*
10  *
11  * prerr.h
12  * This file is automatically generated; please do not edit it.
13  */
14 
15 /* Memory allocation attempt failed */
16 #define PR_OUT_OF_MEMORY_ERROR                   (-6000L)

Anyway, although the mapping is semantically incorrect, we seem to have a unique mapping, and so the changes to PR_* can wait. I also checked the usage and
it seems that no code depends on the particular value of the error code from my quick check. (I could be wrong.)
Anyway, I should file a separate bugzilla for PR_* mapping errors, I think.

TIA
(In reply to ISHIKAWA, Chiaki from comment #1)

> I got a review from bsmedberg, I obtained review from bsmedberg here.

Ouch, I meant to say, I "request" review from bsmedberg here.

TIA
Comment on attachment 8604452 [details] [diff] [review]
More comprehensive mapping of errno to NS_* error macros

It seems unlikely to me that we want to introduce new nsresult codes unless somebody is going to use them. EINTR handling in particular probably shouldn't ever escape the underlying I/O system, since there is no equivalent on windows.
Attachment #8604452 - Flags: review?(benjamin) → review?(nfroyd)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #4)
> Comment on attachment 8604452 [details] [diff] [review]
> More comprehensive mapping of errno to NS_* error macros
> 
> It seems unlikely to me that we want to introduce new nsresult codes unless
> somebody is going to use them. EINTR handling in particular probably
> shouldn't ever escape the underlying I/O system, since there is no
> equivalent on windows.

I won't be doing this unless I need to refer to the error codes.

I intend to refer to some errors in my quest to make POP3/IMAP code more robust in the face of transient network errors [1][2][3][4][5].
And if possible, at least, I will try to make the messages more user-friendly in the future based on the detailed error codes.
(To my dismay, some error paths simply discarded error information and used
NS_ERROR_FAILURE/NS_ERROR_ABORT and thus user is deprived of any contextual information in the final error message. That lost contextual information could have been very useful in the subsequent user action.)  

yes, I have a feeling that EINTR is handled at lower code although I have a suspition
that it is not handled properly according to a comment from someone who is experiencing problems with a linux TB using a profile on a remote CIFS-mount.
So just making sure that EINTR is not passed up by mistake to the upper-layer code
(i.e., not hiding it under a generic NS_ERROR_FAILURE code)
is important here.

[1] Bug 1121842 [META] RFC: C-C Thunderbird - Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends.
[2] Bug 1122698	C-C Thunderbird - Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends. (Step-1)
[3] Bug 1134527 C-C Thunderbird - Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends. (Step-2)
[4] Bug 1134529	C-C Thunderbird - Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends. (Step-3)
I throw the latest one, here:
[5] Bug 1164283 C-C TB crashed in ftell timing out due to transient network error (user profile on remote file system) 

In all these cases, it was not clear what underlying error codes were (at least now I finally see EHOSTDOWN under linux translated to NS_* macro. I am expecting to see a few more during the local testing.
The use of ns-variant |Write|,|Read|, etc. and PR-variant |PR_Write|, |PR_Read|, etc. in a mixed manner is sure recipe for bugs and obscure error handling.
At least with my patch, users of ns-variant functions should be able to figure out low-level errors if they are passed up to the final application layer.

Re comprehensive approach:
I touched all error values for POSIX (or the values mapped to PR_* error macros, to be exact) so that I don't have to revisit this again maybe after one year or two, finding  I forgot to map ANOTHER error which is required in other error processing code. That is why the "comprehensive" approach now.

TIA
Comment on attachment 8604452 [details] [diff] [review]
More comprehensive mapping of errno to NS_* error macros

Review of attachment 8604452 [details] [diff] [review]:
-----------------------------------------------------------------

I looked through the bugs cited in comment 5 and didn't see any usage of the newly added NS_ERROR codes, so I'm a bit skeptical that they're actually needed.  We can discuss adding new error codes when they would actually be used.

There's a lot in this patch to keep track of, could you please split it into two parts?

Part 1 would be the comments in ErrorList.h for existing error code mappings.
Part 2 would be additional mappings (if any) for errno in nsLocalFile.h to currently-existing error codes in ErrorList.h, along with updating comments in ErrorList.h, if necessary.

I think the domerr.msg changes would be a third part, or even a different bug.

::: dom/base/domerr.msg
@@ +139,5 @@
>  DOM_MSG_DEF(NS_ERROR_FACTORY_EXISTS                , "Factory already exists")
> +DOM_MSG_DEF(NS_ERROR_DEADLOCK                      , "Deadlock would have occurred")
> +DOM_MSG_DEF(NS_ERROR_INTERRUPT                     , "Interrupted by a signal (EINTR)")
> +DOM_MSG_DEF(NS_ERROR_NO_MORE_LOCK                  , "System has run out of locks for files and records")
> +DOM_MSG_DEF(NS_ERROR_OVERFLOW                      , "Value is too large to store in a data type.")

I don't understand what these additions are for (nor the ramifications of adding these).  Can you explain why you added these in this particular patch (as opposed to a different bug)?

::: xpcom/base/ErrorList.h
@@ +41,2 @@
>    ERROR(NS_ERROR_NOT_AVAILABLE,                 0x80040111),
> +  /* Returned when a class is notr egistered */

Looks like this comment got changed inadvertently.
Attachment #8604452 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #6)
> Comment on attachment 8604452 [details] [diff] [review]
> More comprehensive mapping of errno to NS_* error macros
> 
> Review of attachment 8604452 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I looked through the bugs cited in comment 5 and didn't see any usage of the
> newly added NS_ERROR codes, so I'm a bit skeptical that they're actually
> needed.  We can discuss adding new error codes when they would actually be
> used.
> 

Thank you for taking the time to look through the patch.

To be honest, at this point, I have no idea whether the newly added NS_ERROR codes 
will show up in error code returned by POP3/IMAP error cases.
But some NEWLY MAPEED error codes definitely are.
And I have only recently realized the lack of mappings.
I did not use the values since they were not available during the time I created the patches mentioned in [1][2][3][4][5].
But now that, at least on local PC, they are available, I would use them in the patches.

As I I said,
> I touched all error values for POSIX (or the values mapped to PR_* error macros, to 
> be exact) so that I don't have to revisit this again maybe after one year or two,
> finding  I forgot to map ANOTHER error which is required in other error processing 
> code. That is why the "comprehensive" approach now.

This is because just getting the bland NS_ERROR_FAILURE/ABORT is not quite nice in terms of error handling, which had been the case in some errors as I noticed in my original comment.

Anyway, I will split the patch as suggested post new ones.

> There's a lot in this patch to keep track of, could you please split it into
> two parts?
> 
> Part 1 would be the comments in ErrorList.h for existing error code mappings.
> Part 2 would be additional mappings (if any) for errno in nsLocalFile.h to
> currently-existing error codes in ErrorList.h, along with updating comments
> in ErrorList.h, if necessary.
> 

Will do.

> I think the domerr.msg changes would be a third part, or even a different
> bug.
> 

> ::: dom/base/domerr.msg
> @@ +139,5 @@
> >  DOM_MSG_DEF(NS_ERROR_FACTORY_EXISTS                , "Factory already exists")
> > +DOM_MSG_DEF(NS_ERROR_DEADLOCK                      , "Deadlock would have occurred")
> > +DOM_MSG_DEF(NS_ERROR_INTERRUPT                     , "Interrupted by a signal (EINTR)")
> > +DOM_MSG_DEF(NS_ERROR_NO_MORE_LOCK                  , "System has run out of locks for files and records")
> > +DOM_MSG_DEF(NS_ERROR_OVERFLOW                      , "Value is too large to store in a data type.")
> 
> I don't understand what these additions are for (nor the ramifications of
> adding these).  Can you explain why you added these in this particular patch
> (as opposed to a different bug)?
> 

I simply followed the instruction in the existing
comment in ErrorList.h (see below), which says
that any change in DOM error codes need to be accompanied 
in a change to domerr.msg. 
It seems that the changes in these files must be paired.

http://mxr.mozilla.org/comm-central/source/mozilla/xpcom/base/ErrorList.h#469
--- begin quote ---
465   /* ======================================================================= */
466   /* 14: NS_ERROR_MODULE_DOM */
467   /* ======================================================================= */
468 #define MODULE NS_ERROR_MODULE_DOM
469   /* XXX If you add a new DOM error code, also add an error string to
470    * dom/base/domerr.msg */
471
--- begin quote ---

And as a matter of fact unless I do this,
an error trace from JavaScript showed unknown error-type of string
in the stack dump/error dump next to an error value.
But then I think I saw a few cases of error values which I did not touch, either, but I was not particularly paying attention to this issue at this moment.
Getting the proper unique error values for upper layer processing is my first priority. 

At least, I added the strings for the new symbols as the comment instructed.
That is all.


> ::: xpcom/base/ErrorList.h
> @@ +41,2 @@
> >    ERROR(NS_ERROR_NOT_AVAILABLE,                 0x80040111),
> > +  /* Returned when a class is notr egistered */
> 
> Looks like this comment got changed inadvertently.

Oops, may be a slip of finger. Thank you for spotting this.

TIA
> >  DOM_MSG_DEF(NS_ERROR_FACTORY_EXISTS                , "Factory already exists")
> > +DOM_MSG_DEF(NS_ERROR_DEADLOCK                      , "Deadlock would have occurred")
> > +DOM_MSG_DEF(NS_ERROR_INTERRUPT                     , "Interrupted by a signal (EINTR)")
> > +DOM_MSG_DEF(NS_ERROR_NO_MORE_LOCK                  , "System has run out of locks for files and records")
> > +DOM_MSG_DEF(NS_ERROR_OVERFLOW    

They are not strictly speaking DOM errors, but I felt they were needed since
there were some cases of missing translation to strings in stack dump.
I will check error logs to see what they were indeed.

TIA
Component: Untriaged → XPCOM
Flags: needinfo?(ishikawa)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: