Closed Bug 1319904 Opened 3 years ago Closed 3 years ago

Cannot login to Sync / FxA when browser is set to "Never Remember History"

Categories

(Firefox :: Sync, defect, P1)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
relnote-firefox --- 50+
firefox50 + fixed
firefox51 + verified
firefox52 + fixed
firefox53 + fixed

People

(Reporter: vladikoff, Assigned: rfkelly)

References

Details

(Whiteboard: [OA])

Attachments

(1 file, 1 obsolete file)

## STR

* Fresh Firefox 50 profile. 
* Go to `about:preferences#privacy` and flip the switch to "Never Remember History"
* The thing will ask you to restart Firefox, do so.
* After the restart try to sing into sync via `about:accounts?action=signin&entrypoint=preferences` (prefs or menu bar)
* Fill out the form and press submit

## Expected

* Should sign into sync or ask for email confirmation

## Actual

* Browser Hangs on "Working...": https://i.imgur.com/vNPp3KH.jpg
> * Fresh Firefox 50 profile. 

FWIW I was also able to reproduce this in latest Nightly.
Setting the browser to "Never remember history" adds a "privateBrowsingId" origin attribute to the requesting principal. Under the hood, I'm guessing never remembering history is the same as using private browsing for all windows.

Our origin checking function (http://searchfox.org/mozilla-central/rev/feef954874af9a18168e61a75629a9406b847c53/toolkit/modules/WebChannel.jsm#185-187) compares the URI's `prePath` to the principal's origin, which includes the origin suffix. So we end up comparing "https://accounts.firefox.com" to "https://accounts.firefox.com^privateBrowsingId=1".

We can fix this by checking if `originOrPermission.prePath === requestPrincipal.originNoSuffix`, which ignores origin attributes. But I'd like to make sure this doesn't do the wrong thing if you sign in to FxA in a private browsing window. (What should that do, anyway?)
Flags: needinfo?(markh)
Wow, nice find Kit!  Is this origin-handling behavior new or changed in FF 50?

So I guess what's happening here is, FxA sends the "can_link_account" webchannel message, the browser thinks its from an invalid origin and ignores it, and FxA hangs forever waiting for a reply.

I wonder if we can:

* Have FxA time out after some sensible period if it doesn't hear back from the browser
* Have the browser explicitly reply with an "origin mismatch" error rather than just ignoring the event

> But I'd like to make sure this doesn't do the wrong thing if you sign in to FxA
> in a private browsing window. (What should that do, anyway?)

My instinct is "don't do that" but I would be nice if we can give a clear error message :-)
(In reply to Kit Cambridge [:kitcambridge] from comment #2)
> We can fix this by checking if `originOrPermission.prePath ===
> requestPrincipal.originNoSuffix`, which ignores origin attributes. But I'd
> like to make sure this doesn't do the wrong thing if you sign in to FxA in a
> private browsing window. (What should that do, anyway?)

Hmm, this affects container tabs, too, since they're also implemented as origin attributes ("userContextId"). What should we do if you try to sign in to FxA from a container tab?

(In reply to Ryan Kelly [:rfkelly] from comment #3)
> Wow, nice find Kit!  Is this origin-handling behavior new or changed in FF
> 50?

I think so, yes. `privateBrowsingId` was added in bug 1269361. IIRC, we didn't use a special attribute for private windows before.

> I wonder if we can:
> 
> * Have FxA time out after some sensible period if it doesn't hear back from
> the browser
> * Have the browser explicitly reply with an "origin mismatch" error rather
> than just ignoring the event

Both make sense to me!
> My instinct is "don't do that" but I would be nice if we can give a clear error message :-)

IMHO the WebChannel origins should still always work in Private Mode and we should be able to log in.


Otherwise, I agree that we can provide more better messaging after "Working..." stays there for long periods of time.
For more info, the check in `browser-content.js` uses `principal.originNoSuffix == whitelisted.originNoSuffix`

Ref: http://searchfox.org/mozilla-central/source/toolkit/content/browser-content.js#905
> IMHO the WebChannel origins should still always work in Private Mode and we should be able to log in

Yeah, you're right.  :markh also points out that other things using WebChannels may also break due to this issue.
I can confirm that changing `origin` to `originNoSuffix` in the checks in WebChannel.jsm fixes this issue for me.  I'm attaching the patch for completeness, although obviously it needs tests etc, and I'm probably not the best person to write them...

Paul, :markh suggested you might be a good person to weigh in with opinions on any security implications here.  Does it seem reasonable to use `originNoSuffix` rather than `origin` when checking the source of webchannel messages?  As Vlad points out in Comment 6, there seems to be precedent for it in other files in the codebase.
Attachment #8813952 - Flags: feedback?(ptheriault)
> I'd like to make sure this doesn't do the wrong thing if you sign
> in to FxA in a private browsing window. (What should that do, anyway?)

After sitting on this for a while, I'm not sure any option other than "it logs you in to the browser like normal" makes sense here.  Many of the things that Sync touches are not affected by Private Browsing Mode - for example, you can still edit your bookmarks just fine between normal and private windows.
Duplicate of this bug: 1320109
I think that patch looks fine as channels should, in general, be expected to work in these pb/container contexts.

FxA might make a policy decision to do something different in some of these pb/container cases, in which case it might be helpful to include additional data in the "sendingContext" object so the handler can differentiate these cases. Comment 9 implies we probably will not make such a policy decision (which I also think is fine), so we probably don't need to bother with this now and could add it later if a real use-case appeared.
Flags: needinfo?(markh)
(In reply to Ryan Kelly [:rfkelly] from comment #8)
> Created attachment 8813952 [details] [diff] [review]
> fix-origin-check-in-webchannel.diff
> 
> I can confirm that changing `origin` to `originNoSuffix` in the checks in
> WebChannel.jsm fixes this issue for me.  I'm attaching the patch for
> completeness, although obviously it needs tests etc, and I'm probably not
> the best person to write them...
> 
> Paul, :markh suggested you might be a good person to weigh in with opinions
> on any security implications here.  Does it seem reasonable to use
> `originNoSuffix` rather than `origin` when checking the source of webchannel
> messages?  As Vlad points out in Comment 6, there seems to be precedent for
> it in other files in the codebase.

As discussed, this sounds right to me. I can't think of any negative security implications here - note that containers do not currently segregate bookmarks, history.
Comment on attachment 8814291 [details]
Bug 1319904 - Ignore origin attributes in webchannel origin check.

https://reviewboard.mozilla.org/r/95538/#review95604
Attachment #8814291 - Flags: review?(markh) → review+
Attachment #8813952 - Attachment is obsolete: true
Attachment #8813952 - Flags: feedback?(ptheriault)
Try run showed an eslint error (for which I've pushed a fix) and a couple of failures that seem totally unrelated:

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=96fc93886cdd

:markh, advice on the above?
Flags: needinfo?(markh)
(In reply to Ryan Kelly [:rfkelly] from comment #16)
> :markh, advice on the above?

With the eslint error fixed this looks fine to push.
Flags: needinfo?(markh)
Pushed by rkelly@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f485eaccbb98
Ignore origin attributes in webchannel origin check. r=markh
https://hg.mozilla.org/mozilla-central/rev/f485eaccbb98
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Verified as fixed in FF53.01a in Windows 7 64-bitt, Ubuntu 16.04 32-bit.
Postfix screencast: http://www.screencast.com/t/CT0RwlPWREN
Status: RESOLVED → VERIFIED
Should we uplift to Aurora, Beta, and possibly Release?
Assignee: nobody → rfkelly
Priority: -- → P1
(In reply to Kit Cambridge [:kitcambridge] from comment #21)
> Should we uplift to Aurora, Beta, and possibly Release?

Ah I thought this would do it:

```
status-firefox50:	affected
status-firefox51:	affected
status-firefox52:	affected
status-firefox53:	verified
```

Yes please, we need this in 50.1 if possible. Kit, which buttons do we need to press to get that going?
(In reply to Vlad Filippov :vladikoff from comment #22)
> Yes please, we need this in 50.1 if possible. Kit, which buttons do we need
> to press to get that going?

Cool. Please request the "approval‑mozilla‑aurora", "approval‑mozilla‑beta", and "approval‑mozilla‑release" flags in https://bugzilla.mozilla.org/attachment.cgi?id=8814291&action=edit. It should populate the comment box with an uplift request form.
Comment on attachment 8814291 [details]
Bug 1319904 - Ignore origin attributes in webchannel origin check.

Approval Request Comment
[Feature/Bug causing the regression]: Private browsing / "Never Remember History"
[User impact if declined]: Those with custom browsing settings cannot sign in or sign up for Firefox Sync
[Is this code covered by automated tests?] Yes
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No 
[List of other uplifts needed for the feature/fix]: Nothing
[Is the change risky?]: No
[Why is the change risky/not risky?]: Not risky, it is a tweak to an origin checked that worked fine in previous versions (before Firefox 50)
[String changes made/needed]: Nothing
Attachment #8814291 - Flags: approval-mozilla-release?
Attachment #8814291 - Flags: approval-mozilla-beta?
Attachment #8814291 - Flags: approval-mozilla-aurora?
Added to Fx50 release notes as a known issue.
Hi Alex, Florin, should we add this as a test case in our testing going forward so we can catch such regressions earlier? Thanks!
Flags: needinfo?(florin.mezei)
Flags: needinfo?(adavis)
(In reply to Ritu Kothari (:ritu) from comment #26)
> Hi Alex, Florin, should we add this as a test case in our testing going
> forward so we can catch such regressions earlier? Thanks!

We definitely want to have a comprehensive set of tests covering Sync and FxA, and this should be in. We'll be talking more about testing of FxA and Sync during our work week (next week).

Keeping the needinfo around as a reminder .
Hi Ritu, I suppose so. 
I'm not sure if we could have foreseen this which makes me wonder if we should now also test other cases relating to browser preferences.
Thanks Florin and Alex. I'll take this fix in 50.1.0.
Comment on attachment 8814291 [details]
Bug 1319904 - Ignore origin attributes in webchannel origin check.

This is a recent regression in 50, approved for all branches.
Attachment #8814291 - Flags: approval-mozilla-release?
Attachment #8814291 - Flags: approval-mozilla-release+
Attachment #8814291 - Flags: approval-mozilla-beta?
Attachment #8814291 - Flags: approval-mozilla-beta+
Attachment #8814291 - Flags: approval-mozilla-aurora?
Attachment #8814291 - Flags: approval-mozilla-aurora+
Duplicate of this bug: 1319346
Hmm, I'm almost 100% sure that this is the wrong way to fix this bug.

What comment 2 explains usually demonstrates that the code which created the principal object in question has not set up the correct OriginAttributes.  OriginAttributes aren't an optional part of the origin checking process that we can just ignore whenever we want to.  In Firefox 50 for the first time we have moved the information about which code comes from a private window to the privateBrowsingId field, and we rely on that in order to make sure that content in private windows is isolated from content in non-private windows, and depending on how WebChannel is being used, this patch has broken that isolation.  :-(

Ryan, did you ever look at where the principal with the 0 privateBrowsingId came from?

Can we please revert this ASAP and fix this properly?  It's really worrying that this was uplifted all the way to release...
Flags: needinfo?(rfkelly)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #34)
> https://hg.mozilla.org/releases/mozilla-release/rev/8376e8db48de

What release will this go into?  One that is coming up this week?  Or has it already gone to release users?
Flags: needinfo?(ryanvm)
50.1 go to build is next week with the intent of shipping on December 13.
Flags: needinfo?(ryanvm)
(In reply to :Ehsan Akhgari from comment #35)
> Ryan, did you ever look at where the principal with the 0 privateBrowsingId
> came from?

Hmm, in http://searchfox.org/mozilla-central/rev/ef3cede9a5b050fffade9055ca274654f2bc719e/toolkit/modules/WebChannel.jsm#189-192, we aren't comparing two principals. We're comparing an `nsIURI` against the page's principal. Are origin attributes still relevant here?
Flags: needinfo?(ehsan)
(In reply to Kit Cambridge [:kitcambridge] from comment #38)
> (In reply to :Ehsan Akhgari from comment #35)
> > Ryan, did you ever look at where the principal with the 0 privateBrowsingId
> > came from?
> 
> Hmm, in
> http://searchfox.org/mozilla-central/rev/
> ef3cede9a5b050fffade9055ca274654f2bc719e/toolkit/modules/WebChannel.jsm#189-
> 192, we aren't comparing two principals. We're comparing an `nsIURI` against
> the page's principal. Are origin attributes still relevant here?

Well, that's only because of the way the code is currently written.  Right now we expect an nsIURI as a second argument, but nsIURI cannot capture all of the information we need to do the check here (i.e., it lacks the OriginAttributes.)

See this caller <http://searchfox.org/mozilla-central/rev/ef3cede9a5b050fffade9055ca274654f2bc719e/browser/components/newtab/NewTabWebChannel.jsm#266> (which I picked at random basically).  Here instead of creating a new nsIURI, we need to create a codebase principal, pass it down to the WebChannel constructor so that there we can do a principal subsumption check.
Flags: needinfo?(ehsan)
> Ryan, did you ever look at where the principal with the 0 privateBrowsingId came from?

I did not; :vladikoff may be able to say more about this.

> Can we please revert this ASAP and fix this properly?

Far be it from me to second-guess your expertise here, if you believe that's necessary then yes, let's do it.  I can coordinate with :markh on the details when starts his day (and setting an ni? on here to flag it).

Whatever the proper fix is, I presume we'll need to do it both here, and in the existing code that uses the same check in http://searchfox.org/mozilla-central/source/toolkit/content/browser-content.js#905

> It's really worrying that this was uplifted all the way to release...

FWIW we did attempt our due diligence here, including a significant amount of out-of-band discussion with :pauljt about any possible security implications.  For my future reference, who else should I have looped in to that discussion?
Flags: needinfo?(rfkelly) → needinfo?(markh)
In general, we should be comparing principals instead of URIs.  This is an issue in frontend code today, where lots of places look at URIs when they should be looking at principals.
(In reply to Ryan Kelly [:rfkelly] from comment #40)
> > Ryan, did you ever look at where the principal with the 0 privateBrowsingId came from?
> 
> I did not; :vladikoff may be able to say more about this.
> 
> > Can we please revert this ASAP and fix this properly?
> 
> Far be it from me to second-guess your expertise here, if you believe that's
> necessary then yes, let's do it.  I can coordinate with :markh on the
> details when starts his day (and setting an ni? on here to flag it).
> 
> Whatever the proper fix is, I presume we'll need to do it both here, and in
> the existing code that uses the same check in
> http://searchfox.org/mozilla-central/source/toolkit/content/browser-content.
> js#905
> 
> > It's really worrying that this was uplifted all the way to release...
> 
> FWIW we did attempt our due diligence here, including a significant amount
> of out-of-band discussion with :pauljt about any possible security
> implications.  For my future reference, who else should I have looped in to
> that discussion?

Sorry Ryan, I did my best. Tanvi would be a start for principals, but honestly I'm not sure. I think ehsan may also be the right person in this case as I think he was guiding the work change private browsing to use origin attributes (which seems related).
Sorry if that read like a jab at you Paul, it definitely wasn't intended as one :-)

Just trying to make clear what we did (and did not) do before requesting uplift here.
So let me attempt to summarize what a proper fix would look like:

* Change the WebChannel constructor to accept a principal object of some sort rather than the current originOrPermission argument
* Update all callers of `new WebChannel()` to pass in a principal appropriate for their security needs:
   * For FxA I'm pretty sure we do in fact want the "allow from this origin regardless of whether it's private browsing mode or not" behaviour; alternately, I'll be happy for someone to explain to me why that's a bad idea.
   * For other callers, we'll have to audit their expectations and construct something appropriate.

Ehsan, does this match what you're expecting here?

I don't yet have a sense of how much surgery that will be in practice.  Are we willing to eat "FxA doesn't work in private browsing or container tabs" if we can't get that implemented for 50.1?
Flags: needinfo?(ehsan)
>    * For other callers, we'll have to audit their expectations and construct something appropriate.

At the very least, I suppose we could ensure that everyone *except* FxA just gets the pre-this-patch behaviour or failing the check in private browsing mode, and follow up on any further changes as we uncover them.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Thanks for taking a look at this, Ryan!

(In reply to Ryan Kelly [:rfkelly] from comment #40)
> > Ryan, did you ever look at where the principal with the 0 privateBrowsingId came from?
> 
> I did not; :vladikoff may be able to say more about this.

OK.  Let me explain how things should work...  We have two places in the code, one is where the principal is coming from, and the other is where we're currently creating nsIURI objects.  We need to verify in the first place that we are either obtaining the principal from a source which already has a valid principal (such as a Document, a Node, etc.) or carefully create a principal using something like createCodebasePrincipal.  In case we're creating a principal, we need to obtain the correct OA for the principal.  We should obtain a principal instead of an nsIURI similarly in the second part.  Then in the WebChannel code, we should check do a principal subsumption check to see whether the principal has access to the other one.  Doing that would make this code safe and correct both currently and in the future depending on what changes we make to OriginAttributes.

> > Can we please revert this ASAP and fix this properly?
> 
> Far be it from me to second-guess your expertise here, if you believe that's
> necessary then yes, let's do it.  I can coordinate with :markh on the
> details when starts his day (and setting an ni? on here to flag it).
> 
> Whatever the proper fix is, I presume we'll need to do it both here, and in
> the existing code that uses the same check in
> http://searchfox.org/mozilla-central/source/toolkit/content/browser-content.
> js#905

Ah yes...  :(  That code is actually doing something a bit worse!  It seems like it already has two principals to compare, and instead of calling subsumes() it's stripping out the OA and compares the raw origin strings...

(Note that in addition to ignoring OAs, such code has other security issues, for example it won't automatically grant access to any code running with the system principal, it doesn't handle null principals correctly and so on.)

> > It's really worrying that this was uplifted all the way to release...
> 
> FWIW we did attempt our due diligence here, including a significant amount
> of out-of-band discussion with :pauljt about any possible security
> implications.  For my future reference, who else should I have looped in to
> that discussion?

It happens to the best of us, I wasn't trying to point fingers.  :-)

The right people for future reference for this stuff probably include (but aren't limited to) myself, tanvi, baku, bholley, bz, smaug.  For anything specific to PB, you should always feel free to reach out to me.
(In reply to Ryan Kelly [:rfkelly] from comment #44)
> So let me attempt to summarize what a proper fix would look like:
> 
> * Change the WebChannel constructor to accept a principal object of some
> sort rather than the current originOrPermission argument

Yep.

> * Update all callers of `new WebChannel()` to pass in a principal
> appropriate for their security needs:

Yes.

>    * For FxA I'm pretty sure we do in fact want the "allow from this origin
> regardless of whether it's private browsing mode or not" behaviour;
> alternately, I'll be happy for someone to explain to me why that's a bad
> idea.

In order to do that, all you need to do is to pass the system principal.  The system principal is a singleton which subsumes all of the other principals in Gecko (aka it has access to everything without any restrictions.)

>    * For other callers, we'll have to audit their expectations and construct
> something appropriate.

Yes.  Note that it is crucial to ensure that we always pass the right principal around, since our security checks all rely on the principals being correct.

> Ehsan, does this match what you're expecting here?
> 
> I don't yet have a sense of how much surgery that will be in practice.  Are
> we willing to eat "FxA doesn't work in private browsing or container tabs"
> if we can't get that implemented for 50.1?

I'm quite certain that for the FxA call site, per above, the system principal would be a good fix...

Are WebChannels exposed to add-ons?  If add-ons use them, then we need to consider how they're using them as well.
Flags: needinfo?(ehsan)
Resetting the release management flags to help RelMan evaluate the new status of the bug.
Tracking this across all current releases. If we can uplift a new fix in time for beta 6 later this week that will give us more time to validate it for 50.1.
Flags: needinfo?(markh)
> In order to do that, all you need to do is to pass the system principal.
>  The system principal is a singleton which subsumes all of the other principals in Gecko
> (aka it has access to everything without any restrictions.)

Without understanding it deeply, this seems too broad to me - we don't want to allow WebChannel messages from anywhere, we want to allow them only from https://accounts.firefox.com, but regardless of whether it's running in Private Browsing or a Container Tab or what-have-you.
(In reply to :Ehsan Akhgari from comment #46)
> Thanks for taking a look at this, Ryan!
> 
> (In reply to Ryan Kelly [:rfkelly] from comment #40)
> > > Ryan, did you ever look at where the principal with the 0 privateBrowsingId came from?
> > 
> > I did not; :vladikoff may be able to say more about this.
> 
> OK.  Let me explain how things should work...  We have two places in the
> code, one is where the principal is coming from, and the other is where
> we're currently creating nsIURI objects.  We need to verify in the first
> place that we are either obtaining the principal from a source which already
> has a valid principal (such as a Document, a Node, etc.) or carefully create
> a principal using something like createCodebasePrincipal.  In case we're
> creating a principal, we need to obtain the correct OA for the principal. 
> We should obtain a principal instead of an nsIURI similarly in the second
> part.  Then in the WebChannel code, we should check do a principal
> subsumption check to see whether the principal has access to the other one. 
> Doing that would make this code safe and correct both currently and in the
> future depending on what changes we make to OriginAttributes.

That makes sense, thanks. However, I'm unclear on how the above would act in the cases we are discussing here - ie, would the check you describe have allow the webchannel for FxA to work correctly in these contexts (ie, private browsing, containers)?

I'm also trying to understand the practical implications the patch as written actually has - ie, how much of this is "we should be doing things differently for a more robust future" vs "we should be doing things differently because the patch as written has actual, identifiable security implications compared to how things work today without that patch"?

> > Whatever the proper fix is, I presume we'll need to do it both here, and in
> > the existing code that uses the same check in
> > http://searchfox.org/mozilla-central/source/toolkit/content/browser-content.
> > js#905
> 
> Ah yes...  :(  That code is actually doing something a bit worse!  It seems
> like it already has two principals to compare, and instead of calling
> subsumes() it's stripping out the OA and compares the raw origin strings...

See bug 1238128 comment 24 - Bobby asked us to change from a .subsumes check to a simple string comparison.

However, that code isn't where we actually check if the content sending the event is from the correct origin for the specific channel, it's only check whether the content is white-listed to allow non-strings to be sent via the message manager (as you will see in that bug). It's messy, but the plan is for that entire block to be removed some time after the next ESR and as the trusted content sites update to sending a string as the event payload instead of an object.

> (Note that in addition to ignoring OAs, such code has other security issues,
> for example it won't automatically grant access to any code running with the
> system principal,

I believe that for web channels we really don't expect to see code with the system principal (and indeed, it would probably imply something very bad if it did).

> it doesn't handle null principals correctly and so on.)

That sounds bad, but it's not clear to me how that would happen in practice. I'm actually not sure what a null principal is, but if that means literally null as the principal, I believe we will correctly fail in that case.

(In reply to :Ehsan Akhgari from comment #47)
> > I don't yet have a sense of how much surgery that will be in practice.  Are
> > we willing to eat "FxA doesn't work in private browsing or container tabs"
> > if we can't get that implemented for 50.1?
> 
> I'm quite certain that for the FxA call site, per above, the system
> principal would be a good fix...

As Ryan mentions, I doubt the system principal is correct as we only trust content from one specific domain - although we want to trust that domain whether it is using PB mode or containers. I imagine the "remote troubleshooting" webchannel has similar requirements - the channel only trusts SUMO, but it should work if the sumo tab was in a non-default container.

IIUC, using the system principal would allow all content to message the FxA web channel, which would be very bad. Me not understanding this proposal correctly is also an obvious possibility :)

> Are WebChannels exposed to add-ons?  If add-ons use them, then we need to
> consider how they're using them as well.

Not that I'm aware of. A quick dxr search shows that the only references are in "Customizied Home Page for Firefox" which is apparently from Mozilla China (addon id 636266), various addons that report themself as "Universal Search desktop FF experiments in addon format" (ids 641250, 641262, 641262, 664656) and "Firefox Hello" (id 676057)

I think we'd be happy to prevent addons trying to use them if that was possible.

TBH though, I'm still not sure where we stand. It sounds like the primary comments here relate to webchannels before this patch landed - which is a good discussion to have, but they have been on the release channel for some time. It's not clear to me how the patch here makes that status-quo worse, nor what change we should make to fix this issue on 50.
> Update all callers of `new WebChannel()` to pass in a principal appropriate for their security needs

For reference, I was able to find the following users of `new WebChannel()` in the codebase:

* The main Firefox Accounts WebChannel
  * this wants to receive messages from a particular origin, regardless of private browsing etc.

* The Android Firefox Accounts WebChannel
  * this should behave the same as above.

* The Firefox Accounts OAuth WebChannel
  * This must only receive messages from a particular origin, but I guess it could plausibly accept further restrictions from the caller.
  * I don't think it has any callers any more though, now that Hello is gone...

* The remote new-tab service (./browser/components/newtab/)
  * This curretly uses a single static allowed origin, read from prefs at startup
  * One could argue that new-tab in private browsing should be isolated from new-tab in the main browser; right it it looks like the origin check would just fail in the same way FxA's fails.

* Remote troubleshooting
  * This passes a string permission "remote-troubleshooting" rather than using an origin check
  * IIUC the logic for checking this bottoms out in nsPermissionManager.cpp, which seems to have its own logic for removing private browsing, userContext etc from the origin check.


:markh are you aware of any others?  I thought uitour was using it, but it appears to have its own custom message type rather than using WebChannels.
Flags: needinfo?(markh)
(In reply to Ryan Kelly [:rfkelly] from comment #54)
> :markh are you aware of any others?  I thought uitour was using it, but it
> appears to have its own custom message type rather than using WebChannels.

Nope, that's all I'm aware of. I also thought uitour leveraged it, but as you say, it uses a hand-rolled message mechanism that winds up using the permission manager to check a URI. ISTM that UITour probably *should* use web channels if for no better reason than to prevent custom event mechanisms for content and chrome code from proliferating.
Flags: needinfo?(markh)
Attachment #8814291 - Attachment is obsolete: true
(In reply to Mark Hammond [:markh] from comment #53)
> 
> > Are WebChannels exposed to add-ons?  If add-ons use them, then we need to
> > consider how they're using them as well.
> 
> Not that I'm aware of. A quick dxr search shows that the only references are
> in "Customizied Home Page for Firefox" which is apparently from Mozilla
> China (addon id 636266), ...

I filed bug 1319346 when we found out our customized new tab is broken in a container tab. We're already using a different and more static new tab in private browsing mode due to lack of IndexedDB support.
> we only trust content from one specific domain - although we want to trust that domain whether
> it is using PB mode or containers.

In fact this is pretty much the entire definition of what the WebChannel interface is for, per [1]:

"""
The WebChannel.jsm JavaScript code module provides an abstraction that uses the Message Manager and Custom Events APIs to create a two-way communication channel between chrome and content code for specific origins.
"""

[1] https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/WebChannel.jsm

> I imagine the "remote troubleshooting" webchannel has similar
> requirements - the channel only trusts SUMO, but it should work if the sumo tab was in a
> non-default container.

I think remote-troubleshooting achieves this already by virtue of using the permission manager, which appears to have code for ignoring private browsing, container tabs, etc:

  http://searchfox.org/mozilla-central/source/extensions/cookie/nsPermissionManager.cpp#105
So I think we're actually a bit stuck here, from Comment 53 on down - setting ni? :Ehsan for more guidance.

I wasn't able to find a principal-based way to express our "allow from this domain, but ignore private browsing etc" requirement.
Flags: needinfo?(ehsan)
> We have two places in the code, one is where the principal is coming from, and the other is where
> we're currently creating nsIURI objects.  We need to verify in the first place that we are
> either obtaining the principal from a source which already has a valid principal (such as a Document,
> a Node, etc.) or carefully create a principal using something like createCodebasePrincipal.

I realized that this may have been what you were asking about in the "did you ever look at..." part of Comment 35, so to clarify: the `requestPrincipal` here always comes from an event object:

  https://dxr.mozilla.org/mozilla-central/rev/cd4cdcc9ad6c45dad8b8d8c0d40e459db2bca8a1/toolkit/modules/WebChannel.jsm#95

Received by chrome code, from content code, via the messagemanager:

  https://dxr.mozilla.org/mozilla-central/rev/cd4cdcc9ad6c45dad8b8d8c0d40e459db2bca8a1/toolkit/modules/WebChannel.jsm#46

So it should always reflect whatever page is trying to send the WebChannelMessageToChrome event, and we should never need to construct a principal for this half of the comparison.  That's the only callsite for the ` _originCheckCallback` affected by the patch here.

> In case we're creating a principal, we need to obtain the correct OA for the principal.

So we don't need to worry about this part.

> We should obtain a principal instead of an nsIURI similarly in the second part.

And this is the part I'm stuck on, trying to express our requirements as a principal of some sort.
(In reply to Ryan Kelly [:rfkelly] from comment #59)
> > We have two places in the code, one is where the principal is coming from, and the other is where
> > we're currently creating nsIURI objects.  We need to verify in the first place that we are
> > either obtaining the principal from a source which already has a valid principal (such as a Document,
> > a Node, etc.) or carefully create a principal using something like createCodebasePrincipal.
> 
> I realized that this may have been what you were asking about in the "did
> you ever look at..." part of Comment 35, so to clarify: the
> `requestPrincipal` here always comes from an event object:
> 
>  
> https://dxr.mozilla.org/mozilla-central/rev/
> cd4cdcc9ad6c45dad8b8d8c0d40e459db2bca8a1/toolkit/modules/WebChannel.jsm#95
> 
> Received by chrome code, from content code, via the messagemanager:
> 
>  
> https://dxr.mozilla.org/mozilla-central/rev/
> cd4cdcc9ad6c45dad8b8d8c0d40e459db2bca8a1/toolkit/modules/WebChannel.jsm#46
> 
> So it should always reflect whatever page is trying to send the
> WebChannelMessageToChrome event, and we should never need to construct a
> principal for this half of the comparison.  That's the only callsite for the
> ` _originCheckCallback` affected by the patch here.
> 
> > In case we're creating a principal, we need to obtain the correct OA for the principal.
> 
> So we don't need to worry about this part.
> 
> > We should obtain a principal instead of an nsIURI similarly in the second part.
> 
> And this is the part I'm stuck on, trying to express our requirements as a
> principal of some sort.

Ah yes, I think I had misunderstood what you were describing, because I didn't know where the two sides of the comparison are coming from.  Furthermore, I didn't understand the peculiar semantics of WebChannel, which intentionally wants to ignore things like whether the origin is in PB or in a container.  That's not something that can be achieved using principals.  Therefore, I think I was wrong in my comment before.  My apologies. :(

Note that this kind of implies that the chrome side of a WebChannel can only hold application global data, since OriginAttributes are used to isolate websites from each other.  I'm not familiar with any of the consumers, but it would be nice to verify that this is true for the consumers.

So I suppose we can now reland the original patches?

(It would be super nice to document the usage of originNoSuffix and why that's needed here in the ctor of WebChannel on trunk at least, for the benefit of the next person who reads this code.  Using originNoSuffix in these kinds of contexts raise red flags for me at least, because in almost every part of the code ignoring the OAs is the wrong thing to do...)
Flags: needinfo?(ehsan) → needinfo?(rfkelly)
> Note that this kind of implies that the chrome side of a WebChannel can only hold application 
> global data, since OriginAttributes are used to isolate websites from each other.  I'm not
> familiar with any of the consumers, but it would be nice to verify that this is true for the consumers.

I believe this to be true of all consumers, but will take some time to confirm.

> It would be super nice to document the usage of originNoSuffix and why that's needed here 
> in the ctor of WebChannel on trunk at least, for the benefit of the next person who reads this code.

Yes, absolutely - I took a fresh look at the patch in isolation yesterday, and I can imagine it raised all sorts of red flags!  My apologies for not making the context clearer both in the commit itself, and in the bugzilla history here, especially given all the possible security implications.

I'll re-work the patch to clarify what's going on and why.
Flags: needinfo?(rfkelly)
> I'll re-work the patch to clarify what's going on and why.

Actually :markh reminds me we should try get the fix into 50 ASAP, so here's what I'll do:

* Un-obsolete the original patch so we can re-land it
* Followup on the docs improvements in a separate patch, in Bug 1321687
Attachment #8814291 - Attachment is obsolete: false
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/05fe18ec6d95
Ignore origin attributes in webchannel origin check. r=markh
Re-marking as RESOLVED/FIXED.  Thanks for your input all, I appreciate us being as thorough as possible about this.
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
What about using BroadcastChannel instead WebChannel? This fits perfectly with "remote new-tab service" case (from comment 54).
Duplicate of this bug: 1322244
Moving this over to Mihai Ninu who will be the owner of testing on the Fx Accounts & Sync side. 

Mihai please verify this fix on Beta 51, and also see comment 27 about adding this sort of scenario and other similar ones to our test cases.
Flags: needinfo?(florin.mezei) → needinfo?(mihai.ninu)
Hi,

This was verified as fixed on FF 51.0b7 Windows 7(64-bit)and on User Agent  Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:51.0) Gecko/20100101 Firefox/51.0 Build ID  20161212025550
In the matter of test cases, these cases will be covered after we speak and prioritize testing with Alex and the team.
Flags: needinfo?(mihai.ninu)
Flags: needinfo?(adavis)
This should have been marked fixed for 53 based on comment 66, sometimes bugherder doesn't work though!
Component: Firefox Sync: UI → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.