Closed Bug 1238128 Opened 8 years ago Closed 8 years ago

Ensure strings are passed from content to chrome when using WebChannel

Categories

(Firefox :: Firefox Accounts, defect, P2)

defect

Tracking

()

RESOLVED FIXED
Firefox 50
Tracking Status
firefox46 --- affected
firefox50 --- fixed

People

(Reporter: oyiptong, Assigned: tcsc)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxa] [fxsync][fxa-waffle])

Attachments

(1 file)

In a chat about WebChannel, Bobby found that there is unnecessary risk encountered when sending messages from content to chrome using WebChannel.

The core issue is that the code that listens to the custom events takes untrusted and unvalidated content objects and pass them directly to privileged code.

Specifically, https://dxr.mozilla.org/mozilla-central/rev/3f780f4b14acab29b16bc4141a44180a8af9dd08/toolkit/content/browser-content.js#674

Bobby suggests to enforce the use of strings when communicating from content to chrome.

The current communication from chrome -> content is still fine with the use of Cu.cloneInto
To be clear, we _do_ have Object Xrays, so this is _probably_ fine. But there's a lot of complicated machinery involved there, and all of these consumers just want to pass things of the form { command: fxaccounts:dosomething, data { email: ... } }. JSON strings work just fine for that, and are a lot more fool-proof.
This works for Firefox for Android, and is a fine line to draw in the sand.  When I saw this, I was surprised, but I actually thought it was by design!
Flags: firefox-backlog+
Whiteboard: [fxa] → [fxa] [fxsync]
To experiment, I modified the channel test to try and send:

>        message: {
>          func: function() {alert('foo')},
>          location: document.location,
>          object: {
>            foo: "bar",
>            callme: function() {},
>          },
>        }

And the results are:
* func doesn't make it - there's a console message:
> 10 INFO Console message: [JavaScript Warning: "XrayWrapper denied access to property "func" (reason: value is callable). ...

* location makes it as strings.

* object.foo makes it, object.callme doesn't but no warnings are logged.

While a quick look implies no consumers of this require nested objects, it seems a shame to not allow {sub: {foo: "foo", bar: "bar"}} and numbers - otherwise we are just asking for the consumers to JSON.stringify() richer objects before sending them and the chrome code to JSON.parse() them.

Would it be enough to do a JSON.parse(JSON.stringify()) dance on the data being passed? We end up with the same results as above (although the XrayWrapper message is now printed on the JSON dance rather than sendAsyncMessage.)
(In reply to Mark Hammond [:markh] from comment #3)
> To experiment, I modified the channel test to try and send:
> 
> >        message: {
> >          func: function() {alert('foo')},
> >          location: document.location,
> >          object: {
> >            foo: "bar",
> >            callme: function() {},
> >          },
> >        }
> 
> And the results are:
> * func doesn't make it - there's a console message:
> > 10 INFO Console message: [JavaScript Warning: "XrayWrapper denied access to property "func" (reason: value is callable). ...
> 
> * location makes it as strings.
> 
> * object.foo makes it, object.callme doesn't but no warnings are logged.
> 
> While a quick look implies no consumers of this require nested objects, it
> seems a shame to not allow {sub: {foo: "foo", bar: "bar"}} and numbers -
> otherwise we are just asking for the consumers to JSON.stringify() richer
> objects before sending them and the chrome code to JSON.parse() them.

That is exactly what I'm asking for this API to require. Can you explain why is is a shame to require that the consumers JSONify any object hierarchies?
 
> Would it be enough to do a JSON.parse(JSON.stringify()) dance on the data
> being passed? We end up with the same results as above (although the
> XrayWrapper message is now printed on the JSON dance rather than
> sendAsyncMessage.)

No. At that point we still have chrome examining a content JS object over Xrays, which is the thing I want us to avoid depending on in new APIs that we introduce and control.

Object Xrays theoretically make it mostly safe to access a non-DOM content object from chrome, but their semantics are very complicated and involved, and were primarily designed to reduce the risk of compatibility problems when they landed. Gabor and I are basically the only people in the world that fully understand their semantics, which makes it very hard for other people to determine whether the interactions are secure.

The boundary between content and chrome is where security bugs happen, and is the thing that we need to audit carefully. The most brain-dead way to make that audit tractable is to just pass strings over the boundary and treat the strings as untrusted, which is why I'm advocating for that approach here.
Flags: needinfo?(markh)
(In reply to Bobby Holley (busy) from comment #4)
> (In reply to Mark Hammond [:markh] from comment #3)
...
> > While a quick look implies no consumers of this require nested objects, it
> > seems a shame to not allow {sub: {foo: "foo", bar: "bar"}} and numbers -
> > otherwise we are just asking for the consumers to JSON.stringify() richer
> > objects before sending them and the chrome code to JSON.parse() them.
> 
> That is exactly what I'm asking for this API to require. Can you explain why
> is is a shame to require that the consumers JSONify any object hierarchies?

I must have misunderstood your suggestion - I thought you were suggesting browser-content.js iterate over the object and pull out strings to be handed to content, and was under the impression a JSON dance would have the same effect and still leave rich object semantics for consumers without the explicit stringify, but...

> No. At that point we still have chrome examining a content JS object over
> Xrays, which is the thing I want us to avoid depending on in new APIs that
> we introduce and control.

IIUC, looping over the object and picking out strings in browser-content is still "chrome examining a content JS object over Xrays", much like the stringify would be doing anyway, which is why I think I must have misunderstood your suggestion. Where should these semantics be implemented?
Flags: needinfo?(markh) → needinfo?(bobbyholley)
(In reply to Mark Hammond [:markh] from comment #5)
> (In reply to Bobby Holley (busy) from comment #4)
> > (In reply to Mark Hammond [:markh] from comment #3)
> ...
> > > While a quick look implies no consumers of this require nested objects, it
> > > seems a shame to not allow {sub: {foo: "foo", bar: "bar"}} and numbers -
> > > otherwise we are just asking for the consumers to JSON.stringify() richer
> > > objects before sending them and the chrome code to JSON.parse() them.
> > 
> > That is exactly what I'm asking for this API to require. Can you explain why
> > is is a shame to require that the consumers JSONify any object hierarchies?
> 
> I must have misunderstood your suggestion - I thought you were suggesting
> browser-content.js iterate over the object and pull out strings to be handed
> to content, and was under the impression a JSON dance would have the same
> effect and still leave rich object semantics for consumers without the
> explicit stringify, but...
> 
> > No. At that point we still have chrome examining a content JS object over
> > Xrays, which is the thing I want us to avoid depending on in new APIs that
> > we introduce and control.
> 
> IIUC, looping over the object and picking out strings in browser-content is
> still "chrome examining a content JS object over Xrays", much like the
> stringify would be doing anyway, which is why I think I must have
> misunderstood your suggestion. Where should these semantics be implemented?

I'm suggesting that the content code which dispatches the CustomEvent (written by us, but run in an untrusted context) should do the stringifcation.
Flags: needinfo?(bobbyholley)
(In reply to Bobby Holley (busy) from comment #6)
> I'm suggesting that the content code which dispatches the CustomEvent
> (written by us, but run in an untrusted context) should do the
> stringifcation.

Oh - I really did misunderstand - you are saying we need not do anything to enforce this, but instead just treat it as a review-enforced requirement?
Flags: needinfo?(bobbyholley)
I am saying that the chrome code that receives the event should ignore it if typeof evt.data != 'string'.
Flags: needinfo?(bobbyholley)
(In reply to Bobby Holley (busy) from comment #8)
> I am saying that the chrome code that receives the event should ignore it if
> typeof evt.data != 'string'.

Ah, thanks - that makes more sense.

A big issue now is how we make this change sanely. The existing (externally hosted) code that sends these messages will need to handle older firefoxes (that will not understand a simple string) and newer ones (that will reject everything but a simple string.) While we probably need dependent bugs for each of the consumers (remote troubleshooting, FxA, loop - are there others?), it probably makes sense to come up with a common migration strategy first. I'm on PTO for a significant part of the next 2 weeks and I'll think more about it when I get back if this bug hasn't moved by then.
(In reply to Mark Hammond [:markh] from comment #9)
> (In reply to Bobby Holley (busy) from comment #8)
> > I am saying that the chrome code that receives the event should ignore it if
> > typeof evt.data != 'string'.
> 
> Ah, thanks - that makes more sense.
> 
> A big issue now is how we make this change sanely. The existing (externally
> hosted) code that sends these messages will need to handle older firefoxes
> (that will not understand a simple string) and newer ones (that will reject
> everything but a simple string.) While we probably need dependent bugs for
> each of the consumers (remote troubleshooting, FxA, loop - are there
> others?), it probably makes sense to come up with a common migration
> strategy first. I'm on PTO for a significant part of the next 2 weeks and
> I'll think more about it when I get back if this bug hasn't moved by then.

Great, thank you!

FWIW, I'd be fine with making exceptions for things that are already deployed if the migration would be a pain. I just don't want to continue making less-safe design decisions for new stuff.
For Loop, we could probably alter the channel id for the new variant, and then the standalone page triggers both types of events when it does its detection. It can then record which works, and send other messages with the correct type.

I think this would be relatively easy for us, but I don't know how easy it would be for other areas.
(In reply to Mark Banner (:standard8) from comment #11)
> For Loop, we could probably alter the channel id for the new variant, and
> then the standalone page triggers both types of events when it does its
> detection. It can then record which works, and send other messages with the
> correct type.
> 
> I think this would be relatively easy for us, but I don't know how easy it
> would be for other areas.

:standard8 - what is the `webChannelId` currently used when Hello opens FxA? Do you specify a `context` query parameter as well?

If no `context` is specified, or if `context` is `webchannel`, Hello could open FxA with a new context `webchannel_v2` or similar. The semantics of `webChannelId` should be restricted to identifying which WebChannel to communicate over, whereas `context` is in place to allow us to make decisions about how the communication should occur.
Flags: needinfo?(standard8)
For context, Shane was talking about FxA. Loop has a separate instance of WebChannel usage to FxA.

I've pointed Shane at the Hello code where FxA is used, so hopefully he can answer the question easier.
Flags: needinfo?(standard8)
Bobby reminded me that we should keep working on a path forward for existing WebChannel consumers here.

Since this bug is in Core::FxAccounts, I assume the focus here should be on the use of webchannels in FxA.  A quick grep through the code showed Loop and remote-troubleshooting as the only other apparent consumers of WebChannels, and both also seem to accept rich objects in the message.  Should we make a meta-bug to coordinate work for all three?
Flags: needinfo?(standard8)
Whiteboard: [fxa] [fxsync] → [fxa] [fxsync][fxa-waffle]
Concretely for FxA, from the perspective of web content, we need to know when to send WebChannel messages in the existing format and when to send them in the new stringified format.

As much as I don't want to because it's a maintenance PITA from our side, I think the simplest way to achieve that would be to create new FxA "context" values and update consumers to use them.  Let's do the work to see what that would look like, and we can make further decisions on timelines and implementation from there.

IIUC we currently have the following auth-brokers that use webchannels:

  + WebChannelAuthenticationBroker
  + FxSyncWebChannelAuthenticationBroker
  +--> FxDestkopV2AuthenticationBroker
  +--> FxDestkopV3AuthenticationBroker
  +--> FxFennecV1AuthenticationBroker
  +--> FxFirstRunV1AuthenticationBroker
  +--> FxFirstRunV2AuthenticationBroker

The first is used for OAuth logins in the browser, the second and its subclasses are used for our many variations of signing into sync.  Firefox for iOS doesn't use webchannels AFAICT.

The following query param flags activate one of these brokers:

  * webChannelId=<anything>
  * context=fx_desktop_v2
  * context=fx_desktop_v3
  * context=fx_fennec_v1
  * context=fx_firstrun_v2
  * service=sync&context=iframe

Shane, could you please check my work here?

AFAICT, the only consumer that uses webChannelId=<anything> is Loop, via the FxAccountsOAuthClient interface in the browser.  It would be great to have this:

> I've pointed Shane at the Hello code where FxA is used, so hopefully he can answer the question easier.

Linked in the bug here for full context on how Loop is using it.  But I'm hopefully we could just change it inside FxAccountsOAuthClient and keep it transparent to Loop.

So at a minimum, it looks like we'd have to create new brokers using the new stringified webchannel message format for:

  * context=oauth_webchannel_v2 (as a replacement for webChannelId=<anything>)
  * context=fx_desktop_v4
  * context=fx_fennec_v2
  * context=fx_firstrun_v3

We'd also have to carefully coordinate so that the first-run experience on a context=fx_desktop_v4 browser uses fx_firstrun_v3, otherwise it could wind up triggering messages in the wrong format.  We could mitigate this by having fx_desktop_v4 accept messages in *either* format for a cycle or two.

Doable, but it sounds like a lot of churn.  Proliferating auth-brokers is slowly killing us and we need to stop doing it at some point.
Flags: needinfo?(stomlinson)
Given Comment 10, it's not obvious to me how urgently we should pursue a migration for existing functionality.

If we have a bit of space, we could fix this as part of a move away from one-off `context=foobar` integration, and towards a general handshake approach where we query the browser for its capabilities on page load.

Bobby, given a commitment not to add any *new* WebChannel functionality using non-string payloads, what sort of timeline would you like to see for us upgrading the existing messages to stringified payloads?
Flags: needinfo?(bobbyholley)
(In reply to Ryan Kelly [:rfkelly] from comment #16)
> Bobby, given a commitment not to add any *new* WebChannel functionality
> using non-string payloads, what sort of timeline would you like to see for
> us upgrading the existing messages to stringified payloads?

I'm fine with leaving the current consumers as they are, indefinitely. What I want is for the machinery itself (WebChannel.jsm) to be modified in such a way that it _requires_ stringified payloads, with some legacy opt-out for the existing stuff. A few code changes to tighten things up now can prevent a lot of proliferation of less-safe patterns down the road.
Flags: needinfo?(bobbyholley)
(In reply to Ryan Kelly [:rfkelly] from comment #15)
> 
>   * webChannelId=<anything>
>   * context=fx_desktop_v2
>   * context=fx_desktop_v3
>   * context=fx_fennec_v1
>   * context=fx_firstrun_v2
>   * service=sync&context=iframe
> 
> Shane, could you please check my work here?

That is a correct list, though we can't go back and retrospectively change the format of existing brokers w/o causing breakage.


> 
> AFAICT, the only consumer that uses webChannelId=<anything> is Loop, via the
> FxAccountsOAuthClient interface in the browser.  It would be great to have
> this:
> 
> > I've pointed Shane at the Hello code where FxA is used, so hopefully he can answer the question easier.


I asked standard8 above if Hello can open FxA w/ a context query parameter to help us out.


> 
> Linked in the bug here for full context on how Loop is using it.  But I'm
> hopefully we could just change it inside FxAccountsOAuthClient and keep it
> transparent to Loop.

IIUC the reqs correctly, we'd still have to serialize the message in FxA before sending it via the WebChannel for FxAccountsOAuthClient to handle, so we still need some way of knowing whether the content server should send the new or old format.


> 
> So at a minimum, it looks like we'd have to create new brokers using the new
> stringified webchannel message format for:
> 
>   * context=oauth_webchannel_v2 (as a replacement for
> webChannelId=<anything>)
>   * context=fx_desktop_v4
>   * context=fx_fennec_v2
>   * context=fx_firstrun_v3
> 
> We'd also have to carefully coordinate so that the first-run experience on a
> context=fx_desktop_v4 browser uses fx_firstrun_v3, otherwise it could wind
> up triggering messages in the wrong format.  We could mitigate this by
> having fx_desktop_v4 accept messages in *either* format for a cycle or two.


I think this would be the fastest approach, otherwise we have to land both the content server and firstrun code to handle the browser version bump before the browser code lands. Serializing like this always stretches the timeline.


> 
> Doable, but it sounds like a lot of churn.  Proliferating auth-brokers is
> slowly killing us and we need to stop doing it at some point.


Death by a thousand brokers. It's not fun.
Flags: needinfo?(stomlinson)
Re: query parameter.

If I'm understanding the code right, we're using FxAccountsOAuthClient.launchWebFlow() to open the FxA page - therefore I think you can just add a parameter and we won't notice it.

Its only if you'd need something for the /fxa-oauth/token that we'd need to change I think.

Re: having a meta bug

I think that would be a good idea, otherwise the other ones will get lost.
Flags: needinfo?(standard8)
Priority: -- → P2
Assignee: nobody → tchiovoloni
Status: NEW → ASSIGNED
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

The more the merrier here - it's important we get this right! Asking Bobby for feedback for obvious reasons, Standard8 to get loop's perspective, Margaret for Android, Olivier for RemoteNewTab, and MattN as he always has a good eye for these kinds of patches.
Attachment #8740476 - Flags: feedback?(standard8)
Attachment #8740476 - Flags: feedback?(oyiptong)
Attachment #8740476 - Flags: feedback?(margaret.leibovic)
Attachment #8740476 - Flags: feedback?(bobbyholley)
Attachment #8740476 - Flags: feedback?(MattN+bmo)
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

https://reviewboard.mozilla.org/r/45791/#review42477

::: browser/base/content/test/general/browser_fxa_oauth.html:11
(Diff revision 1)
>  </head>
>  <body>
>  <script>
>    window.onload = function(){
>      var event = new window.CustomEvent("WebChannelMessageToChrome", {
> -      detail: {
> +      detail: JSON.stringify({

I was going to suggest that fxa/loop/etc tests should *not* stringify their events, as the real servers will not yet be stringifying them (ie, so we keep testing as close as possible to what the server actually sends) - but that's going to be a problem as the test servers are not in the whitelist - however, if we use a preference we *could* get away with doing that - possibly even keeping existing tests with objects and one new test in each case that uses strings. I'm fairly relaxed about that TBH, but others may have a different opinion here.

::: browser/base/content/test/general/browser_web_channel.html:43
(Diff revision 1)
>          break;
>      }
>    };
>  
>    function test_generic() {
>      var event = new window.CustomEvent("WebChannelMessageToChrome", {

I think we should add a test here that attempts to send an object and verify it doesn't arrive. browser_web_channel.js already has a hacky technique for checking a message doesn't arrive in iframes, so it shouldn't be too bad.

::: toolkit/content/browser-content.js:675
(Diff revision 1)
>  };
>  FindBar.init();
>  
>  // An event listener for custom "WebChannelMessageToChrome" events on pages.
>  addEventListener("WebChannelMessageToChrome", function (e) {
> +  // XXX Do not expand this list!!

I'm starting to think I gave you bad advice here re hard-coding the list of exceptions. While this should work fine for most use-cases, there are going to be examples where this is a problem - eg, RemoteNewTab has relatively complex code that changes the allowed domain based on whether the production, staging or dev environment is being used, and Firefox Accounts might be problematic for self-hosters. So I'm starting to think that we should use a preference - say a simple string preference we split on whitespace - it's not ideal that people with these special requirements need to know to change such a pref, but it's probably better than screwing them completely. It also gives us a story for tests.

Also, If we assume this lands in 48, it seems that once Firefox 52 is in release (which should be the first ESR with it in), it will be safe to start asking the server owners to make the corresponding change so they unconditionally send strings instead of objects, and the exception for that service could be removed from this list. Even though that is a long time away, I wonder if we should consider getting bugs on file for that future work?

::: toolkit/content/browser-content.js:695
(Diff revision 1)
>    if (e.detail) {
> +    if (typeof e.detail !== 'string') {
> +      // Check if the principal is one of the ones that's allowed to send
> +      // non-string values for e.details.
> +      let objectsAllowed = !!objectWhitelist.find(origin => {
> +        let questionedPrincipal = Services.scriptSecurityManager.createCodebasePrincipalFromOrigin(origin);

Although we eventually expect this code to be the minority of cases, initially every single channel message will hit this block - so I think we should (lazily) build an array of principals to avoid creating them for each message.

Also, in bug 1242857 comment 3, :sicking says "Any time you're calling createCodebasePrincipal you almost always are [wrong]" - this is calling the FromOrigin version, but the same comment might apply; Bobby can clarify here.

::: toolkit/modules/WebChannel.jsm:80
(Diff revision 1)
> +    if (typeof data === 'string') {
> +      // It's allowed not to be a string for a few legacy origins, listed in
> +      // toolkit/content/browser-content.js. In the case that it is a string,
> +      // we need the WebChannel ID, so we have to parse it.
> +      try {
> +        let jsonData = JSON.parse(data);

any reason we can't just |data = JSON.parse(data);|?
Attachment #8740476 - Flags: review?(markh)
> any reason we can't just |data = JSON.parse(data);|?

Nope, I thought it was a little clearer this way that the data would remain unchanged if the parse threw, but its not something I really care about and I'll fix it in the updated patch.

> Although we eventually expect this code to be the minority of cases, initially every single channel message will hit this block - so I think we should (lazily) build an array of principals to avoid creating them for each message.

Absolutely, I actually had wanted to do something like this, but wasn't sure how performance concerned I should be, and didn't want to get told off for making the code less clear with premature optimization. (I did run some simple benchmarks before submitting the patch, and that loop runs in about 100-200us, which is fairly slow but not so horrifying that I couldn't allow myself to ignore it).

> Also, in bug 1242857 comment 3, :sicking says "Any time you're calling createCodebasePrincipal you almost always are [wrong]" - this is calling the FromOrigin version, but the same comment might apply; Bobby can clarify here.

I was not sure the way to do this at all. That bug gets around the issue by using the system principal (afaict at least) which is not applicable for this case.

> I was going to suggest that fxa/loop/etc tests should *not* stringify their events, as the real servers will not yet be stringifying them (ie, so we keep testing as close as possible to what the server actually sends) - but that's going to be a problem as the test servers are not in the whitelist - however, if we use a preference we *could* get away with doing that - possibly even keeping existing tests with objects and one new test in each case that uses strings. I'm fairly relaxed about that TBH, but others may have a different opinion here.

So you're saying I add the test servers to the pref (programmatically via Services.prefs or whatever) before running object versions of the tests? (The test server being example.com, right? Or do you mean some actual server implementing running a testing version of the loop/fxa server)? 

This sounds fairly easy, although it will end up with a lot of duplicated tests (I don't mind doing this, but I could understand if that wasn't desirable for the owners of those modules). It might be sufficient to ensure that objects are allowed if the url is whitelisted (e.g. add it to the pref) in the browser_web_channel test.

I'll wait until I hear what to do about that and the createCodebasePrincipal call, before submitting a revised version of the patch (unless I should submit it without these changes as an intermediate step), but everything else here was easy enough.
https://reviewboard.mozilla.org/r/45791/#review42751

f=me modulo the one issue.

I really appreciate the followup up here. A stitch in time saves nine! :-)

::: toolkit/content/browser-content.js:691
(Diff revision 1)
> +    if (typeof e.detail !== 'string') {
> +      // Check if the principal is one of the ones that's allowed to send
> +      // non-string values for e.details.
> +      let objectsAllowed = !!objectWhitelist.find(origin => {
> +        let questionedPrincipal = Services.scriptSecurityManager.createCodebasePrincipalFromOrigin(origin);
> +        return principal.subsumes(questionedPrincipal);

Why do you need to do a full subsumes check here? Are you actually expecting anything other than a codebase principal? If not, just doing a string comparison with principal.originNoSuffix should be fine, no?
(In reply to Thom Chiovoloni [:tcsc] from comment #23)
> Absolutely, I actually had wanted to do something like this, but wasn't sure
> how performance concerned I should be.

Me either :)

> and didn't want to get told off for
> making the code less clear with premature optimization.

heh - you certainly wouldn't have been "told off" :) But yeah, it's a judgement call - there's no correct answer.

> So you're saying I add the test servers to the pref (programmatically via
> Services.prefs or whatever) before running object versions of the tests?

Yep - although I didn't mean to duplicate all tests - just to copy one of them from each distinct web-channel test and convert it to using a string (or even vice-versa - convert all as you did, but copy one and have it use an object after adjusting the pref).  As I said though, I'm fairly relaxed about it and wouldn't mind if your patch was taken as is - it was more a thought-bubble for others to comment on.

> I'll wait until I hear what to do about that and the createCodebasePrincipal
> call, before submitting a revised version of the patch (unless I should
> submit it without these changes as an intermediate step), but everything
> else here was easy enough.

Waiting for the other comments sounds like the best approach IMO - this bug isn't urgent.

[mid-air collision with Bobby!]

I think Bobby is correct and a string compare would be fine, but let's still wait for other comments to save unnecessary churn]
Sorry, I haven't had time to look yet. Planning on looking at this on Monday.
Attachment #8740476 - Flags: feedback?(bobbyholley) → feedback+
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

https://reviewboard.mozilla.org/r/45791/#review43827

I'm sorry, I don't have time to provide a thorough review, but the /mobile test changes look fine to me.
Attachment #8740476 - Flags: review+
Attachment #8740476 - Flags: feedback?(margaret.leibovic)
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

https://reviewboard.mozilla.org/r/45791/#review43937

(In reply to Nick Alexander :nalexander from comment #2)
> This works for Firefox for Android, and is a fine line to draw in the sand. 
> When I saw this, I was surprised, but I actually thought it was by design!

It was… I didn't realize we weren't supposed to fully trust the Xray wrappers.

::: toolkit/content/browser-content.js:691
(Diff revision 1)
> +    if (typeof e.detail !== 'string') {
> +      // Check if the principal is one of the ones that's allowed to send
> +      // non-string values for e.details.
> +      let objectsAllowed = !!objectWhitelist.find(origin => {
> +        let questionedPrincipal = Services.scriptSecurityManager.createCodebasePrincipalFromOrigin(origin);
> +        return principal.subsumes(questionedPrincipal);

Nit: Most of toolkit/ and browser/ use double-quotes and double-= (not === unless it will make a difference).

::: toolkit/content/browser-content.js:693
(Diff revision 1)
>    let principal = e.target.nodePrincipal ? e.target.nodePrincipal : e.target.document.nodePrincipal;
>  
>    if (e.detail) {
> +    if (typeof e.detail !== 'string') {
> +      // Check if the principal is one of the ones that's allowed to send
> +      // non-string values for e.details.

"e.detail", not "e.details"

FWIW, I agree with the other open issues as well.
Attachment #8740476 - Flags: review+
Attachment #8740476 - Flags: feedback?(MattN+bmo)
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/1-2/
Attachment #8740476 - Flags: review?(markh)
Attachment #8740476 - Flags: feedback?(standard8)
Attachment #8740476 - Flags: feedback?(oyiptong)
Attachment #8740476 - Flags: feedback+
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

https://reviewboard.mozilla.org/r/45791/#review47113

This looks great, thanks!

There are lots of issues here, but all of them are nits, and most of them are in tests, so I don't think I need to look over this again when all the issues are resolved (but as mentioned, I didn't open an issue for every single quote (and possibly missed some spaces around operators) in the tests, but please change all ones introduced by this patch whereever they appear.

::: browser/base/content/test/general/browser_fxa_oauth.html:11
(Diff revision 2)
>  </head>
>  <body>
>  <script>
>    window.onload = function(){
>      var event = new window.CustomEvent("WebChannelMessageToChrome", {
> -      detail: {
> +      // Note: This intentionally sends a string instead of an object, to ensure both work

Given the other one of the "both" is in a different file, it probably can't hurt to have the comment in both files explicitly point to the other file being the "other"

::: browser/base/content/test/general/browser_fxa_oauth.js:312
(Diff revision 2)
>  
>  function test() {
>    waitForExplicitFinish();
>  
>    Task.spawn(function () {
> +    const webchannelWhitelistPref = 'webchannel.allowObject.urlWhitelist';

nit: double quotes are generally preferred

::: browser/base/content/test/general/browser_fxa_oauth.js:314
(Diff revision 2)
>    waitForExplicitFinish();
>  
>    Task.spawn(function () {
> +    const webchannelWhitelistPref = 'webchannel.allowObject.urlWhitelist';
> +    let origWhitelist = Services.prefs.getCharPref(webchannelWhitelistPref);
> +    let newWhitelist = origWhitelist+" http://example.com";

nit: spaces around operators

::: browser/base/content/test/general/browser_fxa_oauth.js:320
(Diff revision 2)
> +    Services.prefs.setCharPref(webchannelWhitelistPref, newWhitelist);
>      for (let test of gTests) {
>        info("Running: " + test.desc);
>        yield test.run();
>      }
> +    Services.prefs.clearUserPref(webchannelWhitelistPref);

I think this should either be in a finally block or moved to using registerCleanupFunction, just on the off chance this tests fails and leaves behind the stale pref value for future tests.

::: browser/base/content/test/general/browser_remoteTroubleshoot.js:10
(Diff revision 2)
>  var {WebChannel} = Cu.import("resource://gre/modules/WebChannel.jsm", {});
>  
>  const TEST_URL_TAIL = "example.com/browser/browser/base/content/test/general/test_remoteTroubleshoot.html"
>  const TEST_URI_GOOD = Services.io.newURI("https://" + TEST_URL_TAIL, null, null);
>  const TEST_URI_BAD = Services.io.newURI("http://" + TEST_URL_TAIL, null, null);
> +const TEST_URI_GOOD_OBJECT = Services.io.newURI('https://'+TEST_URL_TAIL+'?object', null, null);

nit: spaces around operators and double quotes

::: browser/base/content/test/general/browser_remoteTroubleshoot.js:84
(Diff revision 2)
>    // Now a http:// URI - should get nothing even with the permission setup.
>    got = yield promiseNewChannelResponse(TEST_URI_BAD);
>    Assert.ok(got.message === undefined, "should have failed to get any data");
> +
> +  // Check that the page can send an object as well if it's in the whitelist
> +  let webchannelWhitelistPref = 'webchannel.allowObject.urlWhitelist';

more single quotes, and spaces around operators below. I won't call the rest out that are in tests, but please have a check and fix other ones here (I know we are already inconsistent, particularly with single quotes, but we might as well not make that problem worse)

::: browser/base/content/test/general/browser_web_channel.js:47
(Diff revision 2)
>        return new Promise(function(resolve, reject) {
>          let tab;
>          let channel = new WebChannel("twoway", Services.io.newURI(HTTP_PATH, null, null));
>  
>          channel.listen(function (id, message, sender) {
> -          is(id, "twoway");
> +          dump(`

stray dump that should be removed.

::: browser/base/content/test/general/browser_web_channel.js:384
(Diff revision 2)
> +            ok(!sawString);
> +            sawString = true;
> +          } else {
> +            reject(new Error(`Unknown message type: ${message.type}`))
> +          }
> +          dump(`### saw string: ${sawString}, sawObject: ${sawObject}\n`)

stray dump

::: browser/base/content/test/general/browser_web_channel.js:391
(Diff revision 2)
> +            resolve();
> +          }
> +        });
> +      });
> +
> +      const webchannelWhitelistPref = 'webchannel.allowObject.urlWhitelist';

I think we should also use a finally block to reset this pref here.

::: browser/base/content/test/general/test_remoteTroubleshoot.html:4
(Diff revision 2)
>  <!DOCTYPE HTML>
>  <html>
>  <script>
> +function makeDetails(object) {

A short comment here about what it does and why would make sense IMO

::: browser/components/newtab/tests/browser/browser_newtabwebchannel.js:198
(Diff revision 2)
> -  yield replyPromise;
> +  yield newReplyPromise();
>    is(replyCount, 1, "only current channel is listened to for replies");
>  
> +  const webchannelWhitelistPref = 'webchannel.allowObject.urlWhitelist';
> +  let origWhitelist = Services.prefs.getCharPref(webchannelWhitelistPref);
> +  let newWhitelist = origWhitelist+" http://mochi.test:8888";

spaces around operators (and while I'm adding this comment, single quotes above and below)

::: browser/extensions/loop/chrome/test/mochitest/test_loopLinkClicker_channel.html:6
(Diff revision 2)
>  <!DOCTYPE HTML>
>  <html>
>  <script>
>  "use strict";
>  
> +function makeDetails(object) {

copy/paste the same comment here too :)

::: toolkit/content/browser-content.js:673
(Diff revision 2)
>        sendAsyncMessage("Findbar:Mouseup");
>    },
>  };
>  FindBar.init();
>  
> -// An event listener for custom "WebChannelMessageToChrome" events on pages.
> +var WebChannelMessageToChromeListener = {

use let instead of var

::: toolkit/content/browser-content.js:677
(Diff revision 2)
>  
> -// An event listener for custom "WebChannelMessageToChrome" events on pages.
> -addEventListener("WebChannelMessageToChrome", function (e) {
> +var WebChannelMessageToChromeListener = {
> +  // Preference containing the list (space separated) of origins that are
> +  // allowed to send non-string values through a WebChannel, mainly for
> +  // backwards compatability. See bug 1238128 for more information.
> +  URL_WHITELIST_PREF: 'webchannel.allowObject.urlWhitelist',

use double quotes

::: toolkit/content/browser-content.js:712
(Diff revision 2)
>  
> -  if (e.detail) {
> +    if (e.detail) {
> +      if (typeof e.detail != 'string') {
> +        // Check if the principal is one of the ones that's allowed to send
> +        // non-string values for e.detail.
> +        let objectsAllowed = !!this._getWhitelistedPrincipals().find(whitelisted =>

maybe .some() would be better here? No big deal, but it seems like the same semantics and avoids the !!

::: toolkit/modules/WebChannel.jsm:75
(Diff revision 2)
>        browser: event.target,
>        eventTarget: event.objects.eventTarget,
>        principal: event.principal,
>      };
>  
> +    if (typeof data === 'string') {

double quotes

::: toolkit/modules/WebChannel.jsm:77
(Diff revision 2)
>        principal: event.principal,
>      };
>  
> +    if (typeof data === 'string') {
> +      // It's allowed not to be a string for a few legacy origins, listed in
> +      // toolkit/content/browser-content.js. In the case that it is a string,

I think this comment should s/listed/enforced/ - but I think it still reads a little clumsy - maybe something like "data must be a string except for a few legacy origins allowed by browser-content.js" is enough, moved to the line before the check (ie, we don't really need to justify why we parse it here either). Not too bothered by this though.
Attachment #8740476 - Flags: review?(markh) → review+
Oh, I also meant to mention that should also open the followup bugs for eventually being able to remove this code in the future - I guess one bug per "origin" makes sense - I can help you find the relevant components for ones that aren't obvious (but I suspect most should be fairly easy to find)
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/2-3/
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/3-4/
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/4-5/
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/5-6/
Blocks: 1275612
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/6-7/
*sob*

> 00:50:18     INFO -  305 INFO TEST-OK | browser/components/sessionstore/test/browser_windowRestore_perwindowpb.js | took 7602ms
> 00:51:48     INFO -  TEST-INFO | Main app process: killed by SIGKILL
> 00:51:49     INFO -  306 INFO checking window state
> 00:51:49  WARNING -  TEST-UNEXPECTED-FAIL | browser/components/sessionstore/test/browser_windowRestore_perwindowpb.js | application terminated with exit code -9

It's bizarre that this change could cause session restore tests to fail in such a way that the process is killed. I guess we could push everything *except* the test changes to try and see if we still see this failure (we'd expect to see many tests that we did touch to fail, just not this one) and see where we go from there.
This blew up a lot more that sessionstore tests across all platforms. Please give this a very thorough Try push before attempting to re-land.

In case it helps, the Windows version of the failures has some maybe-useful stack traces:
https://treeherder.mozilla.org/logviewer.html#?job_id=28865883&repo=mozilla-inbound#L12191
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/7-8/
Attachment #8740476 - Attachment description: MozReview Request: Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users r?markh → Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users
Attachment #8740476 - Flags: review+
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/8-9/
Comment on attachment 8740476 [details]
Bug 1238128 - Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/45791/diff/9-10/
RyanVM: Does this look alright to you? (only failure I can't get to go away by retrying is on android (rc4), and this code doesn't run on Android) 

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c379a2ac466b
Flags: needinfo?(ryanvm)
If this code doesn't run on Android, I guess it's OK. But I'd rather someone who looks at Treeherder on a regular basis would weigh in.
Flags: needinfo?(ryanvm) → needinfo?(wkocher)
That's a known failure, so you should be fine.
Flags: needinfo?(wkocher)
FTR, the previous patch had a preference observer for the new "webchannel.allowObject.urlWhitelist" pref, which Thom discovered was causing the previous failures. The new version of the patch checks the preference value each time a non-string message is received, but only rebuilds the list of allowed principals when the pref value has actually changed since we last built the list. I feel that's fine for how channels are used, and will eventually be able to be removed completely as the servers specified in the whitelist transition to strings.
Pushed by mhammond@skippinet.com.au:
https://hg.mozilla.org/integration/autoland/rev/af0f02e07f6a
Ensure that the details passed to WebChannelMessageToChrome is a string, with a whitelist for messages from existing users r=Margaret,markh,MattN
https://hg.mozilla.org/mozilla-central/rev/af0f02e07f6a
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Product: Core → Firefox
Target Milestone: mozilla50 → Firefox 50
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: