Closed Bug 949078 Opened 11 years ago Closed 11 years ago

Conversion failures for the result of callback codegen are reported pretty weirdly

Categories

(Core :: DOM: Core & HTML, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla29

People

(Reporter: bzbarsky, Assigned: bzbarsky)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Your typical WebIDL callback has this general form, eliding some error-checking:

  SomeCppType
  Foo::MethodName(ErrorResult& aRv)
  {
    CallSetup s();
    JSContext* cx = s.GetContext();
    JS::Rooted<JS::Value> rval(cx, JS::UndefinedValue());
    // Do the getprop or call on mCallback
    // Convert rval to SomeCppType, using cx as the JSContext
  }

Now the interesting thing is what happens when rval cannot be converted to SomeCppType.  For example, say SomeCppType is a non-nullable interface type and rval is null (this came up today).  In that situation, we will do:

    ThrowErrorMessage(cx, MSG_NOT_OBJECT, "Return value");
    aRv.Throw(NS_ERROR_UNEXPECTED);
    return nullptr;

What actually happens when the CallSetup comes off the stack and tries to report the error on cx?  If script is not already running on cx we get an error report with no useful file/line info.  Not that happy.  :(

What _should_ the behavior be here? Let's consider separately the two kinds of callbacks we can have: JS-implemented webidl and normal callbacks.  

For the former, I would argue that we should in fact throw through to the caller.  We can implement that via replacing aRv.Throw(NS_ERROR_UNEXPECTED) above with getting the exn Value off cx and using aRv.ThrowJSException.  The benefit is that this gets thrown as a useful exception to the ultimate JS caller, and gets its line/file info, so when we hit this we can figure out what code is responsible.  The drawback is that the page gets an exception like "return value is not an object" for what is fundamentally a programming error in our API implementation; _it_ can't do anything to fix this exception.  If I could somehow leave the current setup with us reporting directly on cx but then immediately reporting the error, with the file/line of the _mCallback_ that would be awesome.  Except the callee is some random object, not a function.  :(

For the latter, I just have no idea.  Propagating the JS exception to some random C++ caller seems pretty questionable.  Maybe the answer is to just report things with no file/line info as needed...

Thoughts?
Bunch of WebIDL/codegen n00b questions, I'm afraid.

> Your typical WebIDL callback

Is EventListener::HandleEvent typical enough, for example?

http://dxr.mozilla.org/mozilla-central/source/dom/webidl/EventListener.webidl#13

> What _should_ the behavior be here? Let's consider separately the two kinds of callbacks we can have: JS-implemented webidl and normal callbacks.

I can't tell if you mean two different kinds of callbacks, or two different kinds of callers. I can't understand the post either way, so I'll assume callback.

TestSingleOperationCallbackInterface is the JS-implemented kind of callback, right? Then what exactly is a "normal callback"? JS-implemented callbacks are the only kind I know about.

> For the latter, I just have no idea.  Propagating the JS exception to some random C++ caller seems pretty questionable.  Maybe the answer is to just report things with no file/line info as needed...

I'm still unsure what case we're talking about here but if the exception is a JS Error (or subclass) then it'll have location information attached. Here are some things we could do:

- make it easier to get that information out of a (possibly wrapped) Error object, without risking reentering user code or other craziness
- make the JS engine capture location information for other exceptions too (it would live for as long as the exception is pending; then it would be thrown away)

(I'm looking for error-handling-related tasks, to triage them and spend some small number of days just knocking out the important ones, so this is a lucky time to be annoyed with our current behavior.)
General thoughts:

  - JS-to-JS and C++-to-C++ calls are not the problem ... right?

  - When we call from JS into C++, and C++ blew it, the contract is *almost* easy: C++ mostly just has to report an error and return false. There are cases where calling JS_ReportError doesn't set a pending exception (or tries to set one and fails), even though we are calling from the JS engine; but I really think we should just get rid of those.

    The protocol should be easier to understand and implement correctly. We should be able to assert that:

      * if you're making a call from JS into C++, there is no exception pending

      * if you're returning an error code from C++ to JS, there is an exception pending

      * if you're returning successfully from C++ to JS, there is no exception pending

    and the basic APIs like JS_ReportError should help you meet those requirements.

  - When we call from C++ into JS, it's a mess. The solution is the same though: hammer out a contract that's simple enough to have some confidence in...

  - The answers should be the same whether the C++ side is XPConnect or WebIDL bindings. We should establish a single contract and work towards both XPConnect and the bindings adhering to it.
Jason, sorry for the lack of background....

> Is EventListener::HandleEvent typical enough, for example?

Yes, for the "not JS-implemented" case.

> I can't tell if you mean two different kinds of callbacks, or two different kinds of
> callers.

The former.  The way we implement JS-implemented WebIDL APIs is by codegenning some JSNatives which call into a C++ class we also codegen, which then calls into basically a WebIDL callback implementation we also codegen, which wraps a chrome JSObject which implements the actual functionality.

So the normal case of calling a JS-implemented API involves something that acts a lot like a WebIDL callback in happening as part of the call.

In all cases the code I'm worried about is involved in making a call from C++ into JS.  Either from Gecko into web page JS (the "normal callback" case) or from binding C++ into chrome JS (the "JS-implemented API" case).

> TestSingleOperationCallbackInterface is the JS-implemented kind of callback, right? 

It's also the normal kind of callback, for purposes of the discussion above.

An example of the JS-implemented API case is something like RTCStatsReport.webidl or anything else with a "JSImplementation" annotation.  Or just TestJSImplGen.webidl, which should be pretty exhaustive nowadays.

> I'm still unsure what case we're talking about here but if the exception is a JS Error
> (or subclass) then it'll have location information attached.

In that case I was talking about the "Gecko C++ calls into web page JS" case.  The exception in this case is created via JS_ReportErrorNumberVA.  So it's an Error subclass, I'm pretty sure, but at the point when JS_ReportErrorNumberVA is called the JSContext it's called on may not have any JS running on it.  And if it has no JS running on it, where would the location information come from?

> - make the JS engine capture location information for other exceptions too 

We should definitely do this, no matter what!

>  - JS-to-JS and C++-to-C++ calls are not the problem ... right?

Those would generally not involve binding code, right.  Pretty much everything under discussion here is C++-to-JS or JS-to-C++-to-other-JS.

> When we call from JS into C++, and C++ blew it, the contract is *almost* easy: C++
> mostly just has to report an error and return false.

Right.  We do that correctly for the most part.

This bug is about the other direction: C++ calls into JS, JS returns something, C++ sees it got something unexpected and needs to somehow communicate to some JS error reporter that the JS screwed up.

> The solution is the same though: hammer out a contract that's simple enough to have some
> confidence in..

We sort of have that.  Or rather several slightly different contracts, represented by the ExceptionHandling enum in mozilla::dom::CallbackObject.  The eReportExceptions and eRethrowExceptions contracts are required for web compat (in that some specs want one and some want the other).

The eRethrowContentExceptions contract is how JS-implemented webidl interfaces (the ones with a chrome JS implementation) make the call into chrome.  This is a content-JS-to-binding-C++-to-chrome-JS call situation.  This contract is designed to not accidentally leak chrome exceptions (which might contain privileged information) into content but still allow chrome to throw the exceptions some spec requires the API to throw and have that exception end up in content with the file/line info of the content call into the webidl code...  What that means is that we sometimes report an exception twice here: once to the chrome code with the full chrome line/file info and whatnot, and once to content as a generic NS_ERROR_FAILURE (with the line/file info of the content caller into C++).

I agree we should fix XPConnect too.  Right now the problem is that XPConnect has no way to tell which of those three contracts it's supposed to be implementing, because the C++ view of XPConnect calls that will end up in JS looks like any other C++ interface.  With WebIDL we deliberately decided to not maintain that fiction, precisely because it lets us do things like explicitly specify what should happen with the exceptions...

We're pretty close to XPConnect never being used to reflect web page objects into C++.  At that point it will only be used for chrome JS objects being reflected into C++.  That _might_ make it easier to just have XPConnect pick one of those three contracts and stick to it.  Maybe.
(In reply to Boris Zbarsky [:bz] from comment #0)

> If I could somehow leave the current setup with us reporting
> directly on cx but then immediately reporting the error, with the file/line
> of the _mCallback_ that would be awesome.

The eventual model I've been working towards is the following:

* When an error is thrown, the stack that's generated examines the entire stack of the JSRuntime, but filters each frame with a subsumes check by the current compartment.
* When an exception bubbles up to a script entry point, it is unconditionally reported on that global, using a hook on nsIGlobalObject.

Assuming that we plan to continue marking callbacks as entry points, the consistent approach here would be to report the exception when the CallSetup comes off the stack. If there are cases (like JS-implemented WebIDL) where we don't want to do that, I would make the argument that those should (eventually) not be marked as entry points.

> Except the callee is some random
> object, not a function.  :(
> 
> For the latter, I just have no idea.  Propagating the JS exception to some
> random C++ caller seems pretty questionable.  Maybe the answer is to just
> report things with no file/line info as needed...

Well, there _is_ a callable in there somewhere, right? It seems like it's just an issue of getting the right information in the right place. I think this is definitely the file/line information that we should be reporting.
> where would the location information come from?

Let me do a more concrete example of what prompted this bug to be filed.  Say we have this API:

  [JSImplementation="whatever-contract-id"]
  interface Foo {
    readonly attribute Document doc;
  }

This API has a single property, named "doc" and the return value is guaranteed to be a non-null Document object if the get doesn't throw.

Now say the chrome object implementing this API is defined as:

  { doc: null }

because the person coding it screwed up.

Now the page does .doc on the object it has.  This lands us in a JSNative, with cx set to the JSContext of the content page.  That JSNative calls some C++ code that says "hey, I need to call into chrome JS".  It does NOT pass through the content page JSContext.  The C++ code looks up the global of the chrome object it has, finds a reasonable JSContext for that global (e.g. if it's a window, uses its associated JSContext).  Then the C++ code does a JS_GetProperty on the chrome object.  All good so far.

Now the C++ code examines the value it got and sees it got null.  So it needs to throw an exception.  Which JSContext should the exception be thrown on, and what location information should it have?
> * When an error is thrown, the stack that's generated examines the entire stack of the
> JSRuntime, but filters each frame with a subsumes check by the current compartment.

It's not just the stack, fwiw.  Exceptions have direct file/line/column members that script can examine, as well as the exception text.

But yes, something like this approach might eventually get us what we want; interactions with devtools would need to be though through.

> report the exception when the CallSetup comes off the stack.

That's what we do right now in the eReportExceptions case.

> I would make the argument that those should (eventually) not be marked as entry points.

That's a spec issue, really.  Should the filter for a treewalker be an entry point?

> Well, there _is_ a callable in there somewhere, right?

No, not necessarily.  See my example in comment 5.
(In reply to Boris Zbarsky [:bz] from comment #6)
> > * When an error is thrown, the stack that's generated examines the entire stack of the
> > JSRuntime, but filters each frame with a subsumes check by the current compartment.
> 
> It's not just the stack, fwiw.  Exceptions have direct file/line/column
> members that script can examine, as well as the exception text.

Well, there are two issues - one is the stack captured by the code that generated the exception. The other is the view that someone gets when a possibly-not-same-origin bubbles up to them on the stack. The code that handles those two things is currently separate. We could make the stack filtering happen lazily (i.e. store the whole stack, and filter it when it's accessed), but then we have to be really careful.
(In reply to Boris Zbarsky [:bz] from comment #6)
> > Well, there _is_ a callable in there somewhere, right?
> 
> No, not necessarily.  See my example in comment 5.

Oh, that's an interesting example that I hadn't considered.

In the case where it's purely a conversion error (and there were no preceding exceptions thrown by script), we should probably report some an error with as much as we can, which would be the following:
* The interface name
* The name of the property being accessed
* The reason the conversion failed
* The the Location from the compartment private of the object's global (this is the same information that about:memory uses).
> In the case where it's purely a conversion error (and there were no preceding exceptions
> thrown by script)

That's the case this bug is about, yes.  We already handle all exceptions thrown by actual scripts in reasonable-enough way, I think.  And in particular, they all have location info.

> we should probably report some an error with as much as we can, which would be the
> following

Hmm.  We can definitely add things like the interface name and property being accessed; that's straightforward and a really good idea.  The reason the conversion failed is already being reported.

The uri from the compartment private, hmm.  I guess since this is fundamentally an error in the callee, that might make some sense....
Getting the compartment URI in there is actually kinda hard, but I think just having the API involved identified is actually exactly what we want here.  In particular, it would allow immediately locating the callee via MXR search, which is all the URI would tell you anyway.
Patch coming up that for this callback IDL hacked into ContactManager:

  readonly attribute Node foobar;

makes the codegen look like so:

  if (rval.isObject()) {
    {
      nsresult rv = UnwrapObject<prototypes::id::Node, nsINode>(&rval.toObject(), rvalDecl);
      if (NS_FAILED(rv)) {
        ThrowErrorMessage(cx, MSG_DOES_NOT_IMPLEMENT_INTERFACE, "Return value of ContactManager.foopy", "Node");
        aRv.Throw(NS_ERROR_UNEXPECTED);
        return nullptr;
      }
    }
  } else {
    ThrowErrorMessage(cx, MSG_NOT_OBJECT, "Return value of ContactManager.foopy");
    aRv.Throw(NS_ERROR_UNEXPECTED);
    return nullptr;
  }

and for this IDL:

  callback SomeCallback = object();

makes the codegen look like so:

  if (rval.isObject()) {
    rvalDecl = &rval.toObject();
  } else {
    ThrowErrorMessage(cx, MSG_NOT_OBJECT, "Return value of SomeCallback");
    aRv.Throw(NS_ERROR_UNEXPECTED);
    return nullptr;
  }
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
(In reply to Boris Zbarsky [:bz] from comment #0)
> For the latter, I just have no idea.  Propagating the JS exception to some
> random C++ caller seems pretty questionable.  Maybe the answer is to just
> report things with no file/line info as needed...
> 
> Thoughts?

Could it make sense to have assertions that would make debug build crash? That way we can easily see where it happens thanks to the message, and we can gdb into it quite easily.
> Could it make sense to have assertions that would make debug build crash? 

Based on a random value received from untrusted JS?  I don't think so, sadly.
Comment on attachment 8346357 [details] [diff] [review]
Provide information about which return value we're talking about when throwing a conversion failure exception for the return value of a call into a WebIDL callback.

Review of attachment 8346357 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/bindings/Codegen.py
@@ +11174,2 @@
>          self.ensureASCIIName(attr)
> +        self.attr = attr

Don't need this

@@ +11179,5 @@
>                                  rethrowContentException=descriptor.interface.isJSImplemented())
>  
> +    def getPrettyName(self):
> +        return "%s.%s" % (self.descriptorProvider.interface.identifier.name,
> +                          self.attr.identifier.name)

if you just use self.attrName here.
Attachment #8346357 - Flags: review?(peterv) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/5560bc3f0bd0 with that change.
Flags: in-testsuite?
Target Milestone: --- → mozilla29
https://hg.mozilla.org/mozilla-central/rev/5560bc3f0bd0
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: