Closed Bug 978618 Opened 6 years ago Closed 6 years ago

Filename/line number missing when WebRTC implementation has a JS error

Categories

(Core :: DOM: Core & HTML, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: jruderman, Assigned: bzbarsky)

References

(Blocks 1 open bug)

Details

(Keywords: testcase, Whiteboard: [fuzzblocker])

Attachments

(1 file)

See bug 978617 for an example. This makes it hard for me to properly ignore known bugs.
Byron, I think this is coming from your code (as is bug 978617).  If I'm wrong, can you reassign this to jib?  (I'm doing the same for bug 978617.)   I'd love to get this fixed soon (especially if it's a quick fix.)
Assignee: nobody → docfaraday
(In reply to Jesse Ruderman from Bug 978617 comment #0)
> System JS : ERROR (null):0 - Permission denied to access property 'toString'
> System JS : ERROR (null):0 - Permission denied to access property 'message'
> System JS : ERROR (null):0 - uncaught exception: unknown (can't convert to
> string)
> 
> JavaScript error: file:///Users/jruderman/Desktop/b.html, line 13:
> NS_ERROR_UNEXPECTED:
> 
> (Also, why are the filenames and line numbers missing?)

Fwiw on FF26 or newer I see the same as Jesse, while on FF25 I see this instead (in Browser console):

NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [IPeerConnection.initialize] @ resource://gre/components/PeerConnection.js:327
NS_ERROR_UNEXPECTED: Unexpected error @ https://bug978617.bugzilla.mozilla.org/attachment.cgi?id=8384333:13

(note the file and line numbers) effectively making this a regression I suppose.

However, in FF25 the code failed in c++, while in FF26 we appear to have added JS code that somehow trips over over Jesse's case and throws an exception in our js code (a js "crash" in our code if you will). If this doesn't show file + line-numbers that seems to be an environment problem or decision. Not sure how we control that part.
I still see this problem when I hard-code the C++ CreateOffer() to return a failure. What we are seeing here is some sort of problem propagating C++ errors back to JS, with possibly some command queuing to keep things interesting.
Ok, have tried adding a try/catch around the call to createOffer() in PeerConnection.js, and thrown a new exception based on toString() of the old one, and that propagates fine. I suspect that the original exception is a chrome object and inaccessible to content, or something similarly bizarre. Will experiment some more.
The exception still does not propagate if I remove the command queuing.
createOffer is not involved here. The exception is in the mozPeerConnection constructor
I have sabotaged the c++ impl of CreateOffer() to make experimentation faster.
Nevermind, I see it too now. Yeah that is a regression for sure.
I'm trying to remove the [ChromeOnly] from the PeerConnectionImpl interface and see if that makes any difference. It did not on an incremental build, but I'm doing a clobber build just in case.
Nope. Makes no difference. Maybe we're just hitting a corner case by having a JS impl webidl API call into a different c++ impl webidl API, and expecting errors in the latter to propagate through the former all the way to content?
Got any hypotheses here?
Flags: needinfo?(bzbarsky)
I should point out that, even if we can get things to propagate correctly here, we are still keeping a command queue in PeerConnection.js, so if a command ends up queued, there will be nowhere for exceptions to land.
The queue exists primarily to protect our internal state machine from bad user code, and well-functioning code tends not to rely on it, so lets leave the queue case alone for now.

We used to propagate correctly in the reported case and something broke in FF26. Oddly, my prime suspect, the PeerConnectionImpl conversion to webidl (Bug 917328) didn't land till 27, so something else broke us.
So there are a few things happening here.  Starting with the last one first, this part:

> JavaScript error: file:///Users/jruderman/Desktop/b.html, line 13: NS_ERROR_UNEXPECTED:

is what happens if chrome JS implementing a WebIDL interface accidentally throws an exception that it didn't mean to throw.  We make a point, in the WebIDL bindings, of not propagating out privileged information about where exactly the exception was thrown and such.  We just make it look like a generic NS_ERROR_UNEXPECTED thrown on the line of the script where the call into JS-implemented webidl was made by the web page.

In this case, the unintentional exception is being thrown on line 404 of PeerConnection.js, for what that's worth.

The other thing is this bit:

>System JS : ERROR (null):0 - Permission denied to access property 'toString'
>System JS : ERROR (null):0 - Permission denied to access property 'message'
>System JS : ERROR (null):0 - uncaught exception: unknown (can't convert to string)

This is more subtle.  We get this because we try to report the chrome exception that got thrown to the console, in addition to throwing the sanitized exception to the web content.  So we call JS_ReportPendingException via nsJSUtils::ReportPendingException in ~CallSetup.  This then goes and does js::ToString, which ends up trying to do a get on a proxy, which does AutoEnterPolicy, which presumably fails, so we land in reportErrorIfExceptionIsNotPending which reports an error like "Permission denied to access property 'toString'".  Similarly for the 'message' property we're getting a security exception from IsDuckTypedErrorObject().

All of these have a null filename and no line number because they're happening when there is no script on the stack on the JSContext they're happening on.

And the reason we can't access that exception object is that in nsJSUtils::ReportPendingException we do a JS_SaveFrameChain, which blows away the current compartment on the context.  Then we enter effectively the default compartment of the JSContext, as far as I can see.  But for this case, calling into a JS component, we're using the safe context, and so its default compartment isn't allowed to touch the privileged exception object.

Bobby, has any of that changed recently?  I was pretty sure we'd tested that we got correct reporting to the console for chrome-side exceptions in JS-implemented webidl, but I totally don't see how it could have possibly worked given that JS_SaveFrameChain behavior.
Flags: needinfo?(bzbarsky) → needinfo?(bobbyholley)
So the regression from the output in comment 2 to what we have today happened in this nightly range:

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c7cc85e13f7a&tochange=e5ca10a2b3d0

so presumably a regression from bug 911258?  But I still don't understand how it could possibly ever have worked...
So in today's world, what ends up throwing those security exceptions is xpc::FilteringWrapper<xpc::XrayWrapper<js::SecurityWrapper<js::CrossCompartmentWrapper>, xpc::DOMXrayTraits>, xpc::CrossOriginAccessiblePropertiesOnly>::defaultValue which calls xpc::XrayWrapper<js::SecurityWrapper<js::CrossCompartmentWrapper>, xpc::DOMXrayTraits>::defaultValue which does a js::DefaultValue, which lands us in xpc::FilteringWrapper<xpc::XrayWrapper<js::SecurityWrapper<js::CrossCompartmentWrapper>, xpc::DOMXrayTraits>, xpc::CrossOriginAccessiblePropertiesOnly>::enter which ends up in xpc::CrossOriginAccessiblePropertiesOnly::check for the "toString" id, which is what throws.

What used to happen instead is that the exception on the JSContext wasn't a WebIDL "Exception" object but some sort of XPConnect object (with class "XPCWrappedNative_NoHelper").  This has no PreCreate hook, so wrapping into a different compartment simply creates a brand-new reflector, not a cross-compartment wrapper.  The result is that we used to happily rewrap the underlying C++ exception object into the new compartment, but now we end up creating a security wrapper which denies access.

It seems a bit weird to me to force ourselves into the default compartment of cx when reporting an exception.  Could we just do it in the compartment of the exception object, if the exception is an object?
Also, we clearly need better test coverage.  :(
Blocks: 911258
Yes. The current webrtc-specific mochitests just test the API validation errors which are all done in PeerConnection.js. Jesse gets past those and into the c++ code at present through some obscure holes in the JS validation code that were left for the c++ code to handle. This creates an opportunity and we should clearly make mochitests that provoke these (starting with his testcase).
Yeah, I don't mean webrtc test coverage, necessarily.  We need binding test coverage somehow.  :(
(In reply to Jan-Ivar Bruaroey [:jib] from comment #13)
> The queue exists primarily to protect our internal state machine from bad
> user code, and well-functioning code tends not to rely on it, so lets leave
> the queue case alone for now.

This isn't really correct. The queue is a basic API feature and there's nothing
wrong with content code relying on it being there. It's true that it was introduced
because it simplified implementation, but that doesn't change the above.
Hence primarily and tends. All usage I've seen chains subsequent calls to callbacks, avoiding the queue, likely because we've defined it so that calls typically rely on the result of the previous call.

I meant leave it alone here because it is not part of the regression, and not what people hit first. I agree that it is part of the API and think we should open a separate bug to construct mochitests out of queue-relying use-cases to validate that it works. Since I can't think of any realistic use-cases that would rely on the queue I would welcome others to open this bug with ideas.
Actually, I can think of one right now, but only because we decided to queue getStats, which may have been a mistake. In any case we should probably take this someplace else.
Incidentally, should JSContext::restoreFrameChain assert that enterCompartmentDepth_ == 0?  Otherwise, you get weird asserts/crashes later on due to too many compartment exits, as I discovered the hard way...
Attachment #8385588 - Flags: review?(bobbyholley)
Assignee: docfaraday → bzbarsky
Status: NEW → ASSIGNED
Whiteboard: [fuzzblocker] → [fuzzblocker][need review]
Comment on attachment 8385588 [details] [diff] [review]
Fix error reporting for unintended XPConnect exceptions thrown by JS-implemented webidl to actually work correctly.

Review of attachment 8385588 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsJSUtils.cpp
@@ +105,5 @@
>    if (JS_IsExceptionPending(aContext)) {
>      bool saved = JS_SaveFrameChain(aContext);
>      {
> +      // JS_SaveFrameChain set the compartment of aContext to null, so we need
> +      // to enter a compartment.  The question is, which one?  We don't want to

Nit - extra space

@@ +114,5 @@
> +      // JSErrorReport to the error reporter on aContext, which might expose
> +      // information from it to script via onerror handlers.  So it's very
> +      // important that the duck typing happen in the same compartment as the
> +      // onerror handler.  In practice, that's the compartment of the window (or
> +      // otherwise default global) of aContext, so use that here.

Thanks for writing this up.

::: dom/bindings/CallbackObject.cpp
@@ +211,5 @@
>      if ((mCompartment && mExceptionHandling == eRethrowContentExceptions) ||
>          mExceptionHandling == eRethrowExceptions) {
>        // Restore the old context options
>        JS::ContextOptionsRef(mCx) = mSavedJSContextOptions;
>        mErrorResult.MightThrowJSException();

So, IIUC, ThrowJSException will accept an argument in any compartment, and it gets lazily wrapped when accessed? Probably worth documenting that in ErrorResults.h.
Attachment #8385588 - Flags: review?(bobbyholley) → review+
(In reply to Boris Zbarsky [:bz] (reviews will be slow; ask someone else) from comment #23)
> Incidentally, should JSContext::restoreFrameChain assert that
> enterCompartmentDepth_ == 0?  Otherwise, you get weird asserts/crashes later
> on due to too many compartment exits, as I discovered the hard way...

That sounds right. Please do so if you have time.
Flags: needinfo?(bobbyholley)
Hmm.  Should I just be using AutoHideScriptedCaller here instead of saving/restoring frame chains?
Flags: needinfo?(bobbyholley)
(In reply to Boris Zbarsky [:bz] (reviews will be slow; ask someone else) from comment #26)
> Hmm.  Should I just be using AutoHideScriptedCaller here instead of
> saving/restoring frame chains?

Is the only reason here to make the DescribeScriptedCaller call in NS_ScriptErrorReporter return false? If so, that should work, yes.
Flags: needinfo?(bobbyholley)
As far as I know, that's the only reason, yes.
> As far as I know, that's the only reason, yes.

The test suite says I don't know enough.  In particular, ReportError() over in the JS engine has this nasty JS_IsRunning(cx) check which makes it silently ignore the JS_ReportPendingException() call.  So we do in fact need to JS_SaveFrameChain here.
> That sounds right. Please do so if you have time.

I took that plus a green try run as an r+.  ;)

https://hg.mozilla.org/integration/mozilla-inbound/rev/abcbfb8e7bc6
Flags: in-testsuite?
Whiteboard: [fuzzblocker][need review] → [fuzzblocker]
https://hg.mozilla.org/mozilla-central/rev/abcbfb8e7bc6
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Component: WebRTC → DOM
Now bug 978617 gives me two lines of console output. This is better. Thanks.

System JS : ERROR resource://gre/components/PeerConnection.js:337 - NS_ERROR_FAILURE:

JavaScript error: file:///Users/jruderman/Desktop/b.html, line 13: NS_ERROR_UNEXPECTED:
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.