Closed Bug 599791 Opened 14 years ago Closed 13 years ago

CClosure::ClosureStub silently drops exceptions

Categories

(Core :: js-ctypes, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla10
Tracking Status
blocking2.0 --- -

People

(Reporter: gal, Assigned: bholley)

References

Details

Attachments

(7 files, 6 obsolete files)

2.24 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
3.79 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
4.70 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
1.73 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
3.77 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
3.27 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
4.93 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
Looks like ClosureStub silently ignores exceptions. Bad.
Hmm. What else can we do here? It'll take a reported error. Is there a way to cause a reported error to be thrown as an exception other than returning JS_FALSE back to JSAPI?
JS API users must propagate failure, or else report the pending exception if there is one. See these APIs:

JS_IsExceptionPending(JSContext *cx);
JS_GetPendingException(JSContext *cx, jsval *vp);
JS_SetPendingException(JSContext *cx, jsval v);
JS_ClearPendingException(JSContext *cx);
JS_ReportPendingException(JSContext *cx);

Note that Report clears Clears.

/be
Ah, I see. The relationship between an error and an exception was unclear to me.

It sounds like the only thing we need to do here is JS_ReportPendingException if JS_CallFunctionValue fails. And we don't need to worry about reporting an exception from other JSAPI methods that fail, because they've already reported.

It occurs to me that we should probably abort if a JSAPI method fails or a script exception is raised. We really have no way of propagating failure to the caller of the closure, which means program execution is unpredictable. (We do zero out the return value, but that doesn't mean much for safety.)

There are a few conditions that can cause failure in ClosureStub:
1) OOM. Aborting is reasonable here.
2) ConvertToJS failure, e.g. if one of the native args cannot be converted to a JS datatype. The only meaningful way this can fail is OOM.
3) The script function itself throws an exception.
4) ImplicitConvert failure, e.g. if the returned primitive cannot implicitly be converted to a native value of the specified type. This would occur if the JS function returned, say, a string primitive instead of a ctypes.char.ptr.

Reporting an error seems nicer, since it gives people more information to fix the bug, but losing safety is not good. :(
Spoke with jorendorff and decided to report & JS_Abort.
Assignee: nobody → dwitte
Attached patch report & abort (obsolete) — Splinter Review
Simple enough.
Attachment #478907 - Flags: review?(gal)
Attached patch bonus fix (obsolete) — Splinter Review
JS_Abort() is unimplemented, resulting in a link error if one tries to use it.
Attachment #478908 - Flags: review?(gal)
Comment on attachment 478907 [details] [diff] [review]
report & abort

Not a big fan of the random abort() call. Is that guaranteed to fail on all obscure platforms? Looks good otherwise.
Attachment #478907 - Flags: review?(gal) → review+
It's what we use all through jstracer.cpp, so I imagine it's fairly universal.
(In reply to comment #3)
> Ah, I see. The relationship between an error and an exception was unclear to
> me.
> 
> It sounds like the only thing we need to do here is JS_ReportPendingException
> if JS_CallFunctionValue fails. And we don't need to worry about reporting an
> exception from other JSAPI methods that fail, because they've already reported.

What ensures they've already reported?

Without JSOPTION_DONT_REPORT_UNCAUGHT, you get an automatic report only when the failing API is returning with no JS code active on cx. Why would that differ for the JS_CallFunctionValue API call in this code from other JS API calls in the same code?

> It occurs to me that we should probably abort if a JSAPI method fails or a
> script exception is raised. We really have no way of propagating failure to the
> caller of the closure, which means program execution is unpredictable. (We do
> zero out the return value, but that doesn't mean much for safety.)

This is our new binary extension model? It's going to be a nightmare even if you ignore OOM. JS can throw. The C code has to have a failure model other than whole-process abort.

> 3) The script function itself throws an exception.
> 4) ImplicitConvert failure, e.g. if the returned primitive cannot implicitly be
> converted to a native value of the specified type. This would occur if the JS
> function returned, say, a string primitive instead of a ctypes.char.ptr.
> 
> Reporting an error seems nicer, since it gives people more information to fix
> the bug, but losing safety is not good. :(

Why can't safety be assured by setting all out params and return values to zero-ish values appropriate for the types, and leaving something like errno set to indicate the JS error?

/be
(In reply to comment #8)
> It's what we use all through jstracer.cpp, so I imagine it's fairly universal.

Argh! This is how bad patterns propagate. That was a short term hack and FIXME comments in the code are supposed to point to the open bug on eliminating those aborts.

OOM happens, it can happen not just with arbitrarily large allocation requests. As Chris Jones pointed out, if you aren't careful about memory pressure monitoring you can walk into it for small fixed-size allocations, even if a GC would free a lot of space.

Do not imitate those aborts as "good precedent." They should go away with the unified tracer/method-JIT stack coming after Fx4.

/be
blocking2.0: --- → ?
I like the errno idea.
errno is also a classically bad pattern.  Is there no more local piece of context that can convey the error from callee to caller?
Modern errno is of course thread-local storage. It's hard to avoid falling into this design gravity-well with ctypes -- the C calls JS without a cx issue.

Same as XPConnect, but it has well-known singletons to pollute with pigeon-holes (see Components.lastResult and Components.returnCode).

So Luke, I agree errno is nasty but there isn't a substantially-different alternative unless we mandate or automate a cx-like parameter for all functions called via jsctypes.

/be
(In reply to comment #9)
> What ensures they've already reported?

Wait, I thought it was convention for JSAPI functions to report and return JS_FALSE on failure? At least, for the JSAPI functions in question.

> This is our new binary extension model? It's going to be a nightmare even if
> you ignore OOM. JS can throw. The C code has to have a failure model other than
> whole-process abort.

Speaking of OOM alone, I agree, it kinda sucks -- we could be dealing with arbitrarily large user-defined data.

> Why can't safety be assured by setting all out params and return values to
> zero-ish values appropriate for the types, and leaving something like errno set
> to indicate the JS error?

How would we know it's an outparam that can safely be zeroed?

  char* foo = new char[20];
  memcpy(foo, "hi", 3);
  int rv = call_my_closure(&foo);
  if (rv == 0)
    char c = foo[0];

In that example, nulling 'foo' and the return value inside call_my_closure will a) leak the originally assigned 'foo' b) erroneously report success to the caller c) crash on the dereference.

I suppose you're arguing that unexpected behavior is better than crashing. But how do we know that's true? By assuming that the caller is sensible about checking not just return value, but also outparam values?

If we're serious about handling exceptions from script, then we need an API for the ctypes consumer to say "on exception, this is the value you should return". (For instance, -1.) But that's not a very complete way of doing it. (An exception handler function in script might be better, especially if we require the ctypes consumer provide one!)

Disregarding OOM, the only way we can fail here is if the ctypes consumer doesn't try/catch inside their closure, or they return an unconvertable value. These are both fairly easy things to do, as long as they know to do them. I agree, though, it will be yet another trap one can easily fall into. :(
(In reply to comment #14)
> (In reply to comment #9)
> > What ensures they've already reported?
> 
> Wait, I thought it was convention for JSAPI functions to report and return
> JS_FALSE on failure? At least, for the JSAPI functions in question.

No. See comment 9 second paragraph from me.

> > This is our new binary extension model? It's going to be a nightmare even if
> > you ignore OOM. JS can throw. The C code has to have a failure model other than
> > whole-process abort.
> 
> Speaking of OOM alone, I agree, it kinda sucks -- we could be dealing with
> arbitrarily large user-defined data.

This isn't a credible replacement for XPConnect, or a good FFI for JS.

> > Why can't safety be assured by setting all out params and return values to
> > zero-ish values appropriate for the types, and leaving something like errno set
> > to indicate the JS error?
> 
> How would we know it's an outparam that can safely be zeroed?

I was assuming you had "direction" (in, inout, out) information along with type information.

>   char* foo = new char[20];
>   memcpy(foo, "hi", 3);
>   int rv = call_my_closure(&foo);
>   if (rv == 0)
>     char c = foo[0];
> 
> In that example, nulling 'foo' and the return value inside call_my_closure will
> a) leak the originally assigned 'foo' b) erroneously report success to the
> caller c) crash on the dereference.

That's an inout param, have to leave it alone on error.

Do you have direction as well as type for each parameter?

/be
(In reply to comment #15)
> Do you have direction as well as type for each parameter?

Aha! Now I'm with you. We do not have that information. We would have to specialcase closure declarations to make it work, which is fine. The question is whether it's worth doing. If we're serious about being robust, and supporting generic function signatures, we have to.

The information we would need is
1) The return value to use on failure
2) The value to set each argument to, for out/inout params, or possibly just assume NULL -- dangerous if the function doesn't have a return value that can indicate failure.

The provided values would have to be converted at closure declaration time, so we can throw if they're unconvertable or we run out of memory trying to convert them.

Thoughts?
I think I have a happy medium here that a) gets us almost all the way to safety, b) does not require a major API change for closures.

We can add an optional arg to the closure constructor:

  let closure = ctypes.FunctionType(...).ptr(my_js_fn, -1);

where -1 is the return value to pass on failure. (This would be eagerly converted on the spot, and throw if it can't be.) We could make this a required arg, but I don't feel strongly about that.

Then, we strongly recommend in the docs and examples that the first thing people do in their js function is set outparams to sensible values, i.e. NULL where appropriate -- nothing fancy that would require potentially big allocations. Alternatively, they catch all exceptions that could be thrown inside the function.

If the js function throws proper, we hope they've supplied a default return value and return it, or return zeroed data if not (or abort?).

This should cover the common case where C callback functions have a return value that can indicate success or failure, such as a result code or a pointer.

If the above conditions are satisfied, the only way we can fail is if we OOM performing a few small allocations for the args. Or if a default return value was not supplied, and the script throws, then we're back to either aborting or being unsafe.

Is anyone happy with that?
I'd note that aborting on small allocation failure is exactly what infallible malloc does, and will continue to do. Whether this will fly in jseng in this particular case is a different question, however. (We could get all fancy and preallocate a little bit of memory that we can free if we OOM in the closure, enough to either complete the allocation or perform a last-ditch GC. Nasty!)
(In reply to comment #18)
> I'd note that aborting on small allocation failure is exactly what infallible
> malloc does, and will continue to do.

This is tenable only if you monitor OS memory pressure and try a GC before failing.

> Whether this will fly in jseng in this
> particular case is a different question, however.

The malloc heap is shared among all the components, not just the JS engine, and the VM address space too (from which the GC virtual-allocs and commits).

E10s brings the needed monitoring and GC/cache-flushing, last I heard, but it is not there yet and it will introduce bugs and bogus failures to rely in what is not there yet.

/be
Someone should contrast comment 17 with what XPConnect does.

/be
As much as this is a bug, I don't think this blocks.
blocking2.0: ? → -
(In reply to comment #20)
> Someone should contrast comment 17 with what XPConnect does.

Peterv, Johnny, Blake?

/be
Reassigning to nobody. If anyone wants to work on this, feel free!
Assignee: dwitte → nobody
Taking.
Assignee: nobody → bobbyholley+bmo
I've spun up a patch stack to implement dwitte's strategy in comment 17. The patches here apply on top of the ones in the dependent bug.

Flagging dwitte for review. This will need sr as well.
Depends on: 682504
Attachment #556225 - Flags: review?(dwitte)
Attached patch part 7 - Tests. v1 (obsolete) — Splinter Review
Attachment #556228 - Flags: review?(dwitte)
Rebasing the tests off of the updated patch from bug 682504.
Attachment #556228 - Attachment is obsolete: true
Attachment #556644 - Flags: review?(dwitte)
Attachment #556228 - Flags: review?(dwitte)
Attachment #556222 - Flags: review?(dwitte) → review?(jorendorff)
Attachment #556223 - Flags: review?(dwitte) → review?(jorendorff)
Attachment #556224 - Flags: review?(dwitte) → review?(jorendorff)
Attachment #556225 - Flags: review?(dwitte) → review?(jorendorff)
Attachment #556226 - Flags: review?(dwitte) → review?(jorendorff)
Attachment #556227 - Flags: review?(dwitte) → review?(jorendorff)
Attachment #556644 - Flags: review?(dwitte) → review?(jorendorff)
Over to jorendorff, per our discussion yesterday evening.

Jorendorff - the patches are nicely divided, so they shouldn't be too hard to review. Let me know if there's anything I can do to make it easier.
Attachment #556222 - Flags: review?(jorendorff) → review+
Comment on attachment 556223 [details] [diff] [review]
part 2 - restructure argument handling in PointerType::ConstructData. v1

Sorry for the slow reviews. So far so good; I will get the rest of these to you on Tuesday.
Attachment #556223 - Flags: review?(jorendorff) → review+
*gentle whine* ;-)
Attachment #556224 - Flags: review?(jorendorff) → review+
Attachment #556225 - Flags: review?(jorendorff) → review+
Comment on attachment 556226 [details] [diff] [review]
part 5 - Prepare the sentinel value and store it in ClosureInfo. v1

>+  // the caller to know about it _know_, rather than some uncertain time in the

know about it _now_

+    // Make sure the callback returns something.
+    if (CType::GetTypeCode(cx, fninfo->mReturnType) == TYPE_void_t) {
+      JS_ReportError(cx, "A void callback can't pass an error sentinel");
+      return NULL;
+    }

This returns before cinfo->errResult is initialized, right? Then the
destructor is called. It seems bad.

Please add a test for this error path.

>+    cinfo->errResult = cx->malloc_(rvSize);
>+    if (!cinfo->errResult) {
>+      JS_ReportOutOfMemory(cx);
>+      return NULL;
>+    }

cx->malloc_ will already have called ReportOutOfMemory on failure, so
you can remove that line.

>   ~ClosureInfo() {
>     if (closure)
>       ffi_closure_free(closure);
>+    if (errResult)
>+      js_free(errResult);
>   };

I'm told that calling js_free directly isn't allowed. (We have a comment
that says it is allowed, but we're getting rid of it, and make check
will actually detect that you're calling js_free and gripe at
you.)

Instead, store a JSRuntime pointer in the ClosureInfo and have the
destructor call rt->free_(errResult). (It would be even better to use
cx, but I imagine there's some reason that can't be done.)

It is OK to mix cx->malloc_ with rt->free_.

There are only two hard things in computer science: memory management, and I forget the other one.
Attachment #556226 - Flags: review?(jorendorff) → review+
Comment on attachment 556227 [details] [diff] [review]
part 6 - Return the sentinel when we fail in ClosureStub. v1

This code should not leave an exception in cx.

Here's what I would change:

>   JSBool success = JS_CallFunctionValue(cx, thisObj, OBJECT_TO_JSVAL(jsfnObj),
>                                         cif->nargs, argv.begin(), &rval);
> 
>-  // If the callback doesn't return anything, we're done.
>-  if (cif->rtype == &ffi_type_void)
>-    return;
>-
>   // Convert the result. Note that we pass 'isArgument = false', such that
>   // ImplicitConvert will *not* autoconvert a JS string into a pointer-to-char
>   // type, which would require an allocation that we can't track. The JS
>   // function must perform this conversion itself and return a PointerType
>   // CData; thusly, the burden of freeing the data is left to the user.
>-  if (success)
>+  if (success && cif->rtype != &ffi_type_void)
>     success = ImplicitConvert(cx, rval, fninfo->mReturnType, result, false, NULL);
> 
>   // Something might have failed. The callee may have thrown, or it may not have
>   // returned a value that ImplicitConvert() was happy with. Depending on how
>   // prudent the consumer has been, we may or may not have a recovery plan.
>   if (!success) {
>+    // In any case, a JS exception cannot be passed to C code, so
>+    // report the exception if any and clear it from the cx.
>+    if (JS_IsExceptionPending(cx))
>+      JS_ReportPendingException(cx);
> 
>     // Good case: we have a sentinel that we can return. Copy it in place of the
>     // actual return value, and then proceed.
>     if (cinfo->errResult) {
> 
>       // The buffer we're returning might be larger than the size of the return
>       // type, due to libffi alignment issues (see above). But it should never
>       // be smaller.
>       size_t copySize = CType::GetSize(cx, fninfo->mReturnType);
>       JS_ASSERT(copySize <= rvSize);
>       memcpy(result, cinfo->errResult, copySize);
>-    }
>-
>-    // Bad case: not much we can do here. The rv is already zeroed out, so we
>-    // just report an error and hope for the best.
>-    else {
>+    } else {
>+      // Bad case: not much we can do here. The rv is already zeroed out, and
>+      // the error has already been reported, so just return and hope for the
>+      // best.
>-      JS_ReportError(cx, "JavaScript callback failed, and an error sentinel "
>-                         "was not specified.");
>       return;
>     }
>   }

r=me with those changes.

The change to the comment formatting is just to be consistent with the rest of the file. CTypes.cpp is a little quirky.
Attachment #556227 - Flags: review?(jorendorff) → review+
Comment on attachment 556644 [details] [diff] [review]
part 7 - Tests. v2

As I mentioned in reviewing part 5, it would be nice to have one or two more tests of those error cases.

Actually, if it's just a couple lines of JS code, please add a test for this feature when the return type is bigger than a word. (Do we support that? If not, a test is still in order, just to check that we don't crash or assert.) It isn't super crucial; if it would take more than 20 minutes, don't worry about it.

Looks good. r=me.

I apologize for the slow review here.
Attachment #556644 - Flags: review?(jorendorff) → review+
Attachment #556225 - Attachment is obsolete: true
Attachment #564999 - Flags: review?(jorendorff)
Attachment #556226 - Attachment is obsolete: true
Attachment #565000 - Flags: review?(jorendorff)
Added updated patches for 4, 5, and 6. Reflagging jorendorff for review, despite the r+.
Attachment #556227 - Attachment is obsolete: true
Attachment #565001 - Flags: review?(jorendorff)
Attachment #564999 - Flags: review?(jorendorff) → review+
Comment on attachment 565000 [details] [diff] [review]
part 5 - Prepare the sentinel value and store it in ClosureInfo. v2

>+  }
>+  else
>+    cinfo->errResult = NULL;

Style nit:

+  } else {
+    cinfo->errResult = NULL;
+  }
Attachment #565000 - Flags: review?(jorendorff) → review+
Comment on attachment 565001 [details] [diff] [review]
part 6 - Return the sentinel when we fail in ClosureStub. v2

>+    }
>+
>+    else {

Same style nit again. Like this:

+    } else {

r=me with that.
Attachment #565001 - Flags: review?(jorendorff) → review+
Attachment #478907 - Attachment is obsolete: true
Attachment #478908 - Attachment is obsolete: true
Attachment #478908 - Flags: review?(gal)
Pushed to mozilla-inbound: http://hg.mozilla.org/integration/mozilla-inbound/rev/4e1565df0b2c (and ancestors)

This had a green try run, modulo known oranges: https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=e71f6e3a3771
Status: NEW → ASSIGNED
Flags: in-testsuite+
Target Milestone: --- → mozilla10
Keywords: dev-doc-needed
I just added some basic documentation for this: https://developer.mozilla.org/en/js-ctypes/js-ctypes_reference/Callbacks
Keywords: dev-doc-needed
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: