WebSocket constructor fails from the scope of an add-on

RESOLVED FIXED

Status

()

Core
XPConnect
P1
normal
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: mossop, Assigned: bz)

Tracking

(Depends on: 1 bug, {addon-compat, regression})

unspecified
x86_64
All
addon-compat, regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox17 unaffected, firefox18+ fixed, firefox19 unaffected, firefox20+ unaffected)

Details

(URL)

Attachments

(3 attachments)

(Reporter)

Description

5 years ago
Possible a platform issue, but we should investigate this. They are loading a hidden frame and trying to use WebSocket from it, it fails since Firefox 18.

https://forums.mozilla.org/addons/viewtopic.php?f=27&t=11544
My first guess would be: bug 798660

But I will look into it and try to find a workaround for them if the platform fix does not land in time.
Here is the minimal addon to reproduce it (from the forum)

// This is an active module of the krizsanits (2) Add-on
exports.main = function() {

var hiddenFrames = require("hidden-frame");
//var WebSocket = null;
var hiddenFrame = hiddenFrames.add(hiddenFrames.HiddenFrame({
   /**
    * @this {Frame} ?
    */
   'onReady': function() {
      this.element.contentWindow.location.href = require("self").data.url("popup.html");
      var self = this;
       this.element.addEventListener("DOMContentLoaded", function() {
         console.log("Initializing WebSocket for '" + self.element.contentWindow.location + "'");
         var WebSocket = self.element.contentWindow.MozWebSocket || self.element.contentWindow.WebSocket;
         console.log(WebSocket);
         new WebSocket("ws://blah.com"); 
       }, true, true);
   }   
}));


};

I tested it with addon builder and it worked fine for the current release (16) and aurora (17) and was throwing NS_ERROR_FAILURE on nightly from WebSocket::Constructor

nsCOMPtr<nsIScriptGlobalObject> sgo = do_QueryInterface(aGlobal);
if (!sgo) {
  aRv.Throw(NS_ERROR_FAILURE);
  return nullptr;
}

This is a regression from Bug 775368. 
The cause is that we call it from a system sandbox (add-on code) and WebSocket constructor is not prepared for a windowless case (the global is the system sandbox here).

Currently I'm working in a patch bug 784686 , that should fix this issue, and also would provide a nicer way for the AddonSDK users to use WebSocket, however I'm blocked right now with it because of some GC issue, I think fixing that asap and landing that patch would be the right move here.
Summary: WebSocket constructor fails from about:blank page → WebSocket constructor fails from the scope of an add-on
Depends on: 784686
Depends on: 775368
Assignee: nobody → gkrizsanits
OS: Windows 7 → All
This issue is IMO a bit more urgent - Firefox release is now 17 and Firefox 18 ( the version the regression occurs in ) is in Beta.

Gabor: this patch would need to be uplifted to Beta?

Comment 4

5 years ago
bug 784686 is too scary for beta. Could we have some simpler fix?
There seems to be two problems here. One that I described and something else that is broke it on beta, then was fixed on aurora. 

I will check out what was the patch that fixed it for aurora, and maybe we can uplift that patch to Beta. My patch is not even possible to uplift to Beta since it depends on the webidl based WebSocket. On Beta that problem does not even exist yet what it supposed to fix.

Comment 6

5 years ago
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #5)
> My patch is not even possible to uplift to
> Beta since it depends on the webidl based WebSocket.
Er, doesn't beta have webidl based websocket?
Blocks: 818048
(In reply to Olli Pettay [:smaug] from comment #6)
> (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #5)
> > My patch is not even possible to uplift to
> > Beta since it depends on the webidl based WebSocket.
> Er, doesn't beta have webidl based websocket?

That's correct, the current beta is Firefox 18, and it is affected by this problem. Gabor: make sure you're up to date! :D
(In reply to Olli Pettay [:smaug] from comment #6)
> Er, doesn't beta have webidl based websocket?

Hmm... you're right. It must be something else then... The only way this could possible work then if on beta for some reason the global in the Constructor gets the  window that owns the WebSocket js constructor and not the scope the 'new WebSocket' line was executed. Do you remember any change that might have caused this?
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #8)
> (In reply to Olli Pettay [:smaug] from comment #6)
> > Er, doesn't beta have webidl based websocket?
> 
> Hmm... you're right. It must be something else then... The only way this
> could possible work then if on beta for some reason the global in the
> Constructor gets the  window that owns the WebSocket js constructor and not
> the scope the 'new WebSocket' line was executed. Do you remember any change
> that might have caused this?
 
In aurora I mean, sorry for the confusion. It's just weird that it is working in aurora when it's broken in beta and it's broken on nightly again.
last nightly where it's broken: e19e170d2f6d (30. 10.)
first nightly it works: bed18790882f (31. 10.)

If we find the patch that fixed it for aurora compared to the broken beta (which should be in the range above) and the patch can be uplifted to beta that would solve the urgent part of the problem. I will debug later how is it working exactly on aurora this evening. After I looked at the list of patches this might be related to the fix: bug 778152 ? Just guessing, I will continue the investigation.
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e19e170d2f6d&tochange=bed18790882f

bug 778152 feels indeed suspicious.
tracking-firefox18: --- → ?
Keywords: addon-compat
This is confusing. I need some help. As I expected the difference is that the same code calls in different version the constructor with different globals... So the case is that a chrome sandbox has a reference to a content XHR ctor and calls it, and in the binding this particular line
JSObject* obj = JS_GetGlobalForObject(cx, JSVAL_TO_OBJECT(JS_CALLEE(cx, vp)));
(WebSocketBinding.cpp _construtor)
in aurora returns the content window, in beta returns the sandbox.

I'm not sure which is the 'right' version but the fact that it differs from version to version is just bad. I would prefer it to return content in the example since that would not break jetpack, and makes more sense to me...

There are two differences between the call stacks, one is direct proxy, the other is the dom constructor, I'm not sure which caused the change but I'm also not sure what should that line above return (global of the wrapper, or global of the wrapped ctor?)

aurora
---------
xul.dll!mozilla::dom::WebSocketBinding::_constructor(JSContext * cx=0x12587628, unsigned int argc=1, JS::Value * vp=0x068900b0)  Line 727 + 0x14 bytes	C++
xul.dll!mozilla::dom::Constructor(JSContext * cx=0x12587628, unsigned int argc=1, JS::Value * vp=0x068900b0)  Line 210 + 0x9 bytes	C++
mozjs.dll!js::InvokeConstructorKernel(JSContext * cx=0x12587628, JS::CallArgs args={...})  Line 439	C++
mozjs.dll!JS_New(JSContext * cx=0x12587628, JSObject * ctorArg=0x06fe3100, unsigned int argc=1, JS::Value * argv=0x068900a8)  Line 5809 + 0x15 bytes	C++
xul.dll!xpc::DOMXrayTraits::construct(JSContext * cx=0x12587628, JSObject * wrapper=0x1343f1f0, unsigned int argc=1, JS::Value * argv=0x068900a8, JS::Value * rval=0x06890098)  Line 1200 + 0xe bytes	C++
xul.dll!xpc::XrayWrapper<js::Wrapper,xpc::DOMXrayTraits>::construct(JSContext * cx=0x12587628, JSObject * wrapper=0x1343f1f0, unsigned int argc=1, JS::Value * argv=0x068900a8, JS::Value * rval=0x06890098)  Line 1671 + 0x14 bytes	C++
mozjs.dll!js::Proxy::construct(JSContext * cx=0x12587628, JSObject * proxy_=0x1343f1f0, unsigned int argc=1, JS::Value * argv=0x068900a8, JS::Value * rval=0x06890098)  Line 2474 + 0x3f bytes	C++
mozjs.dll!proxy_Construct(JSContext * cx=0x12587628, unsigned int argc=1, JS::Value * vp=0x06890098)  Line 3034 + 0x15 bytes	C++
mozjs.dll!js::InvokeConstructorKernel(JSContext * cx=0x12587628, JS::CallArgs args={...})  Line 456 + 0x9 bytes	C++
mozjs.dll!js::Interpret(JSContext * cx=0x12587628, js::StackFrame * entryFrame=0x06890050, js::InterpMode interpMode=JSINTERP_NORMAL)  Line 2328 + 0xf bytes	C++
mozjs.dll!js::RunScript(JSContext * cx=0x12587600, JS::Handle<JSScript *> script={...}, js::StackFrame * fp=0x06890050)  Line 326 + 0x7 bytes	C++
mozjs.dll!js::InvokeKernel(JSContext * cx=0x12587628, JS::CallArgs args={...}, js::MaybeConstruct construct=NO_CONSTRUCT)  Line 384	C++
mozjs.dll!js::Invoke(JSContext * cx=0x1343b190, const JS::Value & thisv={...}, const JS::Value & fval={...}, unsigned int argc=1, JS::Value * argv=0x06890020, JS::Value * rval=0x0052d14c)  Line 414 + 0x12 bytes	C++
mozjs.dll!js::BaseProxyHandler::call(JSContext * cx=0x12587628, JSObject * proxy=0x136230b0, unsigned int argc=1, JS::Value * vp=0x06890010)  Line 266 + 0x50 bytes	C++
mozjs.dll!js::CrossCompartmentWrapper::call(JSContext * cx=0x12587628, JSObject * wrapper_=0x13623001, unsigned int argc=1, JS::Value * vp=0x06890010)  Line 635 + 0xf bytes	C++
mozjs.dll!proxy_Call(JSContext * cx=0x12587628, unsigned int argc=1, JS::Value * vp=0x06890010)  Line 3026 + 0x2e bytes	C++
mozjs.dll!js::InvokeKernel(JSContext * cx=0x12587628, JS::CallArgs args={...}, js::MaybeConstruct construct=NO_CONSTRUCT)  Line 362 + 0x9 bytes	C++
mozjs.dll!js::Invoke(JSContext * cx=0x1361c3c0, const JS::Value & thisv={...}, const JS::Value & fval={...}, unsigned int argc=1, JS::Value * argv=0x0052d3a4, JS::Value * rval=0x0052d314)  Line 414 + 0x12 bytes	C++
mozjs.dll!JS_CallFunctionValue(JSContext * cx=0x12587628, JSObject * objArg=0x1361c3c0, JS::Value fval={...}, unsigned int argc=1, JS::Value * argv=0x0052d3a4, JS::Value * rval=0x0052d314)  Line 5770 + 0x46 bytes	C++
xul.dll!nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS * wrapper=0x128c6888, unsigned short methodIndex=3, const XPTMethodDescriptor * info_=0x01fc1b48, nsXPTCMiniVariant * nativeParams=0x0052d510)  Line 1434	C++
xul.dll!nsXPCWrappedJS::CallMethod(unsigned short methodIndex=3, const XPTMethodDescriptor * info=0x01fc1b48, nsXPTCMiniVariant * params=0x0052d510)  Line 580 + 0x13 bytes	C++
xul.dll!PrepareAndDispatch(nsXPTCStubBase * self=0x127e3fd8, unsigned int methodIndex=3, unsigned int * args=0x0052d5bc, unsigned int * stackBytesToPop=0x0052d5ac)  Line 85 + 0x15 bytes	C++
xul.dll!SharedStub()  Line 113	C++
xul.dll!nsEventListenerManager::HandleEventSubType(nsListenerStruct * aListenerStruct=0x12779dd4, nsIDOMEventListener * aListener=0x127e3fd8, nsIDOMEvent * aDOMEvent=0x127e1258, nsIDOMEventTarget * aCurrentTarget=0x127aa420, unsigned int aPhaseFlags=4, nsCxPusher * aPusher=0x0052d6f4)  Line 904 + 0x14 bytes	C++


beta
------
xul.dll!mozilla::dom::WebSocketBinding::_constructor(JSContext * cx=0x11b1fc60, unsigned int argc=1, JS::Value * vp=0x065b0098)  Line 582	C++
mozjs.dll!js::InvokeConstructorKernel(JSContext * cx=0x11b1fc60, JS::CallArgs args={...})  Line 436	C++
mozjs.dll!js::Interpret(JSContext * cx=0x11b1fc60, js::StackFrame * entryFrame=0x065b0050, js::InterpMode interpMode=JSINTERP_NORMAL)  Line 2465 + 0xf bytes	C++
mozjs.dll!js::RunScript(JSContext * cx=0x11b1fc00, JS::Handle<JSScript *> script={...}, js::StackFrame * fp=0x065b0050)  Line 324 + 0x7 bytes	C++
mozjs.dll!js::InvokeKernel(JSContext * cx=0x11b1fc60, JS::CallArgs args={...}, js::MaybeConstruct construct=NO_CONSTRUCT)  Line 381	C++
mozjs.dll!js::Invoke(JSContext * cx=0x0faf8190, const JS::Value & thisv={...}, const JS::Value & fval={...}, unsigned int argc=1, JS::Value * argv=0x065b0020, JS::Value * rval=0x001fd314)  Line 411 + 0x12 bytes	C++
mozjs.dll!js::IndirectProxyHandler::call(JSContext * cx=0x11b1fc60, JSObject * proxy=0x0fcdf0b0, unsigned int argc=1, JS::Value * vp=0x065b0010)  Line 456 + 0x50 bytes	C++
mozjs.dll!js::CrossCompartmentWrapper::call(JSContext * cx=0x11b1fc60, JSObject * wrapper_=0x0fcdf001, unsigned int argc=1, JS::Value * vp=0x065b0010)  Line 722 + 0xf bytes	C++
mozjs.dll!proxy_Call(JSContext * cx=0x11b1fc60, unsigned int argc=1, JS::Value * vp=0x065b0010)  Line 3034 + 0x2e bytes	C++
mozjs.dll!js::InvokeKernel(JSContext * cx=0x11b1fc60, JS::CallArgs args={...}, js::MaybeConstruct construct=NO_CONSTRUCT)  Line 360 + 0x9 bytes	C++
mozjs.dll!js::Invoke(JSContext * cx=0x0fcd93c0, const JS::Value & thisv={...}, const JS::Value & fval={...}, unsigned int argc=1, JS::Value * argv=0x001fd568, JS::Value * rval=0x001fd4d8)  Line 411 + 0x12 bytes	C++
mozjs.dll!JS_CallFunctionValue(JSContext * cx=0x11b1fc60, JSObject * objArg=0x0fcd93c0, JS::Value fval={...}, unsigned int argc=1, JS::Value * argv=0x001fd568, JS::Value * rval=0x001fd4d8)  Line 5889 + 0x46 bytes	C++
xul.dll!nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS * wrapper=0x11e691a8, unsigned short methodIndex=3, const XPTMethodDescriptor * info_=0x004e0d48, nsXPTCMiniVariant * nativeParams=0x001fd6d4)  Line 1434	C++
xul.dll!nsXPCWrappedJS::CallMethod(unsigned short methodIndex=3, const XPTMethodDescriptor * info=0x004e0d48, nsXPTCMiniVariant * params=0x001fd6d4)  Line 580 + 0x13 bytes	C++
xul.dll!PrepareAndDispatch(nsXPTCStubBase * self=0x11e56588, unsigned int methodIndex=3, unsigned int * args=0x001fd780, unsigned int * stackBytesToPop=0x001fd770)  Line 85 + 0x15 bytes	C++
xul.dll!SharedStub()  Line 113	C++
xul.dll!nsEventListenerManager::HandleEventSubType(nsListenerStruct * aListenerStruct=0x11d3d564, nsIDOMEventListener * aListener=0x11e56588, nsIDOMEvent * aDOMEvent=0x11f5cec8, nsIDOMEventTarget * aCurrentTarget=0x11d665f8, unsigned int aPhaseFlags=4, nsCxPusher * aPusher=0x001fd8b8)  Line 879 + 0x14 bytes	C++
That's just a result of fixing bug 778152 which puts Xrays around interface and interface prototype objects, so the constructor goes through an Xray now. Aurora is probably correct.
Do you think that patch is safe to be uplifted to Beta?
Relman triage drive-by - waiting for answer on comment 14 to decide if we can/should track this for 18 or wait for a fix to come up from Aurora.
I'd really like to see this uplifted if possible - we have a popular add-on that works in 17 & 19, but breaks in 18.
I think it would be wiser to turn off the new DOM binding for WebSocket instead. Porting bug 778152 is scary, it's a big patch (and as bz points out it did have a couple of regressions).
Although apparently it's not behind a pref, so it might be pretty scary too.
Another option is to spot-fix the WebSocket property on the Xray for a Window to return some function that will then invoke the real underlying constructor.  Something similar to the hackery we already have in Xrays for nodePrincipal and such.

Gabor, does that sound feasible to you?
(In reply to Boris Zbarsky (:bz) from comment #19)
> Another option is to spot-fix the WebSocket property on the Xray for a
> Window to return some function that will then invoke the real underlying
> constructor.  Something similar to the hackery we already have in Xrays for
> nodePrincipal and such.
> 
> Gabor, does that sound feasible to you?

Sounds a bit scary too for beta to be honest. The problem is that the constructor of the WebSocket is realtively complex and making a mistake there can instantly cause a sec crit/serious bug. Last time I tried to do that for mozMatchesSelector which is a trivial function I ended up causing two sec bugs...
Hmm.  mozMatchesSelector was reimplementing the entire quickstub in xrays.

I'm suggesting something much simpler.  Let me try to write it up, I guess.
Actually, so far I just can't reproduce the problem (e.g. by running code in a browser-mode scratchpad)...  I guess I need to run it inside a sandbox or something?  Gabor, are you able to write a testcase I could either scratchpad or install as an extension to test?
Created attachment 689820 [details]
add-on for repro

This is a build of this builder project: https://builder.addons.mozilla.org/package/164566/latest/
Assigning to bz given comment 21, and tracking for release. Sorry, I've read this bug and forum thread a couple of times over - what popular add-on is currently affected?
Assignee: gkrizsanits → bzbarsky
status-firefox17: --- → unaffected
status-firefox18: --- → affected
status-firefox19: --- → unaffected
status-firefox20: --- → affected
tracking-firefox18: ? → +
tracking-firefox20: --- → +
Keywords: regression
(Reporter)

Comment 25

5 years ago
(In reply to Alex Keybl [:akeybl] from comment #24)
> Assigning to bz given comment 21, and tracking for release. Sorry, I've read
> this bug and forum thread a couple of times over - what popular add-on is
> currently affected?

1Password
Created attachment 689961 [details] [diff] [review]
It's a hack, but...
Attachment #689961 - Flags: review?(gkrizsanits)
Attachment #689961 - Flags: review?(bobbyholley+bmo)
Whiteboard: [need review]
We should add a test true, including on trunk, but I wanted to get the reviews started...
Comment on attachment 689961 [details] [diff] [review]
It's a hack, but...

Review of attachment 689961 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok for what it is. Please write a test before landing though!

r=bholley

::: js/xpconnect/wrappers/XrayWrapper.cpp
@@ +1143,5 @@
> +                    // We can't check that it's the right native, sadly
> +
> +                    // Let's go ahead and create a function in the right
> +                    // compartment.
> +                    MOZ_ASSERT(oldFunction, "What?");

nit - no need for a message if it's not informative.

@@ +1144,5 @@
> +
> +                    // Let's go ahead and create a function in the right
> +                    // compartment.
> +                    MOZ_ASSERT(oldFunction, "What?");
> +                    JSString* functionName = JS_GetFunctionId(oldFunction);

nit - JSString *fuctionName.

@@ +1157,5 @@
> +                        newFunction =
> +                            JS_NewFunctionById(cx, native,
> +                                               JS_GetFunctionArity(oldFunction),
> +                                               JS_GetFunctionFlags(oldFunction),
> +                                               nullptr, functionId);

There was a bug recently about nullptr not always doing the right thing for parent when using JSAPI. I don't remember if the fix is on beta or not. Please just pass cx->global() to be safe.
Attachment #689961 - Flags: review?(bobbyholley+bmo) → review+
Created attachment 690033 [details] [diff] [review]
Test that fails without this patch, passes with it, passes on m-c
Attachment #690033 - Flags: review?(bobbyholley+bmo)
https://tbpl.mozilla.org/?tree=Try&rev=533f16542efe
Attachment #690033 - Flags: review?(bobbyholley+bmo) → review+
Thanks a lot Boris, I couldn't have written this patch this fast for sure. Sorry about the test I was having a day off and was not close to my computer.
No worries.  I just couldn't reproduce for a bit because the sandboxes I was trying to use all had a window as the principal holder, which was what confused me.  ;)
Comment on attachment 689961 [details] [diff] [review]
It's a hack, but...

Review of attachment 689961 [details] [diff] [review]:
-----------------------------------------------------------------

Tested the original example as well just in case (the sandbox there is chrome), and it works fine. Looks good to me. One question, are we going to land this only on beta, or shall I create some clean up bug for it?
Attachment #689961 - Flags: review?(gkrizsanits) → review+
Component: General → XPConnect
Product: Add-on SDK → Core
Comment on attachment 689961 [details] [diff] [review]
It's a hack, but...

We should land this only on beta.  It would make no sense on the other channels...  The test should go on all branches; I'll work on landing that.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Bug 775368
User impact if declined: Various add-ons no longer work properly.
Testing completed (on m-c, etc.): Passes the various testcases involved as well
   as automated tests.
Risk to taking this patch (and alternatives if risky): Medium risk.  The
   alternatives are backing out WebIDL WebSocket or backporting the Xray
   changes for interface objects, which are both riskier, we think.  This
   change is scoped to only affect cases when a WebSocket is being created in a
   sandbox, which is the case that's broken right now anyway, so hopefully the
   risk is manageable.
String or UUID changes made by this patch: None.
Attachment #689961 - Flags: approval-mozilla-beta?
Pushed test to inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/14c80343a9f6
Whiteboard: [need review] → [leave open][need approval]
Pushed test to aurora: https://hg.mozilla.org/releases/mozilla-aurora/rev/44aee6e63829
Comment on attachment 689961 [details] [diff] [review]
It's a hack, but...

Given the lack of good alternatives, and the popularity of this known affected add-on, this risk is manageable. Let's land now for Beta 4. We'll still have ~3 weeks to react to fallout.

Thanks for jumping on this so quickly bz.
Attachment #689961 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
https://hg.mozilla.org/releases/mozilla-beta/rev/203d39ae3be6

Fixed, as far as I know.  m-c is not affected by this problem, right?
Status: NEW → RESOLVED
Last Resolved: 5 years ago
status-firefox18: affected → fixed
status-firefox20: affected → unaffected
Resolution: --- → FIXED
Whiteboard: [leave open][need approval]
(In reply to Boris Zbarsky (:bz) from comment #38)
> Fixed, as far as I know.  m-c is not affected by this problem, right?

If I recall it correctly there was a similar issue on m-c for some other reason... I will get back to this, double check and reopen / investigate it soon. Or in case if it's broken on m-c shall I open another bug instead of reopening this one?
Tested it on the latest nightly and it works fine.
https://hg.mozilla.org/mozilla-central/rev/14c80343a9f6
You need to log in before you can comment on or make changes to this bug.