Closed Bug 1601594 Opened 1 year ago Closed 11 months ago

Changes to allow for origin isolation

Categories

(Core :: DOM: postMessage, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla78
Tracking Status
firefox78 --- fixed

People

(Reporter: annevk, Assigned: tt)

References

(Blocks 1 open bug)

Details

Attachments

(5 files, 3 obsolete files)

https://github.com/whatwg/html/issues/5122 sketches out a plan that would eventually allow for origin isolation under certain circumstances.

I think we should make these changes to see if rolling them out as part of "resab" could be successful. There's little risk here as currently sites are not setting COOP+COEP and they would notice something would not work the moment they set them.

These are the changes:

  • Rather than allowing SharedArrayBuffer same-site, we'd only allow messaging it same-origin.
  • In a COOP+COEP (i.e., cross-origin isolation) environment, the document.domain setter returns early (as its first step).

Since this is a speculative change that might have to be rolled back I don't think we want to change web-platform-tests at this point, so we'll have to mark some of them as expected to fail. Google agrees with trying this out as well.

(This does not block enabling things on Nightly, it's strictly additive.)

Tom, I think you're best suited to tackle this. I think this is best done as a change to our existing behind-a-pref implementation of this functionality and then reusing that pref for the required document.domain change. A try run can then identify the tests that end up changing which we can evaluate together. I expect we simply have to mark a set as failing for now, which is fine, and fix them later once it's clear this change can stick and ship to Release.

Flags: needinfo?(ttung)

Tom and I discussed implementation strategies of the first change. There's two different ways to go about it:

  1. Aside from AgentClusterId we'd also compare the origin (which is already part of the message). This would accomplish the stated goal.
  2. We change the way DocGroup (docgroup is 1:1 with AgentClusterId) is keyed within a TabGroup to be origin-based, rather than site-based. This is the more sound approach (and aligns with the theoretical model at https://github.com/whatwg/html/pull/4734), but the side effects are not immediately apparent.

If we ignore Spectre (and we can for the purposes of this decision as long as Fission hasn't shipped) 1 and 2 ought to be identical from a web content perspective. And even in a Fission world we might still not want to always isolate by origin for resource limitation reasons.

I think it would be interesting to do a try run for 2, but if there's indeed strange fallout that would require a lot of time to sort through we should go with 1 and file a follow-up.

Blocks: 1619649

(Tom will update web-platform-tests to ensure tests for same site fail, tests for same origin succeed, and same logic for documennt.domain when he works on this bug)

(In reply to Anne (:annevk) from comment #2)

  1. We change the way DocGroup (docgroup is 1:1 with AgentClusterId) is keyed within a TabGroup to be origin-based, rather than site-based. This is the more sound approach (and aligns with the theoretical model at https://github.com/whatwg/html/pull/4734), but the side effects are not immediately apparent.

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=49156022b19702699e9a0fef0f846b0151002f3b

Assignee: nobody → ttung
Status: NEW → ASSIGNED
Flags: needinfo?(ttung)

I listed things which I am doing/will do in this bug:

  • Change the key for DocGroup
  • Disable SAB/Wasm related, and expected test failures
  • Add some tests for ensuring the same-site is not allowed to postMessage the SAB
  • Throw a security error for the setter of document.domain if the crossOriginIsolated is not true
  • Maybe disable some reasonable failed tests
  • Add some tests to ensure the behavior

Tom, I think we can update the affected SAB/Wasm tests rather than disabling them.

As for document.domain, the idea is to return from the setter (i.e., not throw) at the latest possible point (i.e., after any exceptions it might throw). This is because many sites set document.domain without any real purpose. Letting them continue to set it would be more compatible.

Otherwise that sounds great!

(In reply to Anne (:annevk) from comment #6)

Thanks for the correction!

Tom, I think we can update the affected SAB/Wasm tests rather than disabling them.

Ah, I see. Will do!

As for document.domain, the idea is to return from the setter (i.e., not throw) at the latest possible point (i.e., after any exceptions it might throw). This is because many sites set document.domain without any real purpose. Letting them continue to set it would be more compatible.

Got it! So the value won't be set into document.domain. Should we propagate any warning message to the web console to indicate that?

Then, I think we should just return here. Will check it again tomorrow.

Returning there matches https://github.com/whatwg/html/pull/4734 I think, thanks!

I wonder if Harald has any thoughts on reporting no-ops in the console.

Flags: needinfo?(hkirschner)

This test was used to verify that postMessaging to a same-origin-domain (but
not same-origin) iframe allows them to see each others' modification.

However, we now only allow same-origin to postMessage SharedArrayBuffer. And,
that means same-origin-domain are not allowed to do that. Therefore, this test
is rewrited to verify that postMessage SharedArrayBuffer from a
same-origin-domain iframe would result the receiver (parent) to get a message
error event.

Depends on D70347

Changes for using messagechannel to share the SharedArrayBuffer.

Depends on D70348

Depends on D70350

(In reply to Tom Tung [:tt, :ttung] from comment #9)

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb531c9c686b3b1dc3a6b632997253d9b2c3ff04

Most of the failures are probably caused by the change for the setter of document.domain.

Got it! So the value won't be set into document.domain. Should we propagate any warning message to the web console to indicate that?

I wonder if Harald has any thoughts on reporting no-ops in the console.

Warning on no-op would be good so developers learn about the side effect, similar to how we warn on deprecated API. Can we limit to one warning per page session, like we do for usage of deprecated APIs?

Flags: needinfo?(hkirschner)

(In reply to :Harald Kirschner :digitarald from comment #17)

Warning on no-op would be good so developers learn about the side effect, similar to how we warn on deprecated API. Can we limit to one warning per page session, like we do for usage of deprecated APIs?

I think that's doable. Will address that in other patches. Thanks!


Just for updating my status on this.
I'm trying to address nika's comment in P1.
Something like:

--- a/dom/base/DocGroup.cpp
+++ b/dom/base/DocGroup.cpp
@@ -22,20 +22,22 @@
 #endif                 // defined(XP_WIN)

 namespace mozilla {
 namespace dom {

 AutoTArray<RefPtr<DocGroup>, 2>* DocGroup::sPendingDocGroups = nullptr;

 /* static */
-nsresult DocGroup::GetKey(nsIPrincipal* aPrincipal, nsACString& aKey) {
+nsresult DocGroup::GetKey(nsIPrincipal* aPrincipal, nsACString& aKey,
+                          bool aCrossOriginIsolated) {
   // Use GetBaseDomain() to handle things like file URIs, IP address URIs,
   // etc. correctly.
-  nsresult rv = aPrincipal->GetBaseDomain(aKey);
+  nsresult rv = aCrossOriginIsolated ? aPrincipal->GetSiteOrigin(aKey)
+                                     : aPrincipal->GetBaseDomain(aKey);
   if (NS_FAILED(rv)) {
     // We don't really know what to do here.  But we should be conservative,
     // otherwise it would be possible to reorder two events incorrectly in the
     // future if we interrupt at the DocGroup level, so to be safe, use an
     // empty string to classify all such documents as belonging to the same
     // DocGroup.
     aKey.Truncate();
   }

However, while I ran a wpt test ( testing/web-platform/tests/html/infrastructure/safe-passing-of-structured-data/shared-array-buffers/window-domain-failure.https.sub.html) with my changes, I found that DocGroup::GetKey would be called twice in a content process with a remote type like "webCOOP+COEP=https://web-platform.test" and a site origin using ":https://web-platform.test".

The first time the value of cross origin isolated was false and the second time was true.
And thus, I hit the assertion:
Assertion failure: mDocGroup->MatchesKey(docGroupKey), at /home/shes050117/Work/mozilla-central/dom/base/Document.cpp:3766

I will figure this out when I have time.

Blocks: 1436186

So, I was thinking the fix is to cache the crossOriginIsoalted flag on the Document. However, here is what I found

Timeline ------+----+-------------+----------+--+--------------->
[1] [2] [3] [4] [5]

[1, 3, 5] when GetDocGroup is called from a document.
[2] the document is set to an inner window
[4] the document is detached from the inner window

So, if the GetDocGroup is called between [2] and [4] like [3], then it's not a problem.

In our meeting, Anne pointed out that the cases in [5] should just use the cached flag since it's in the same agent cluster.

I will find out the case for getting docGroup in [1] and post it here.

(In reply to Tom Tung [:tt, :ttung] from comment #19)

So, I was thinking the fix is to cache the crossOriginIsoalted flag on the Document. However, here is what I found

Timeline ------+----+-------------+----------+--+--------------->
[1] [2] [3] [4] [5]

[1, 3, 5] when GetDocGroup is called from a document.
[2] the document is set to an inner window
[4] the document is detached from the inner window

So, if the GetDocGroup is called between [2] and [4] like [3], then it's not a problem.

In our meeting, Anne pointed out that the cases in [5] should just use the cached flag since it's in the same agent cluster.

I will find out the case for getting docGroup in [1] and post it here.

The place that I cache the value of crossOriginIsolated from the inner window is Document::SetScriptGlobalObject. The reason why I choose it is because that function is called when an inner window is set to a document.

Below is the case where someone gets the docgroup from a document and the document's crossOriginIsolated has been set (which means there hasn't had an inner window for the document).

 0:24.78 pid:26770 #01: mozilla::dom::Document::CrossOriginIsolated() const (/home/shes050117/Work/mozilla-central/dom/base/Document.cpp:15332)
 0:24.78 pid:26770 #02: mozilla::dom::Document::GetDocGroupOrCreate() (/home/shes050117/Work/mozilla-central/dom/base/Document.cpp:6750)
 0:24.79 pid:26770 #03: nsNodeInfoManager::Allocate(unsigned long) (/home/shes050117/Work/mozilla-central/dom/base/nsNodeInfoManager.cpp:275)
 0:24.79 pid:26770 #04: nsINode::operator new(unsigned long, nsNodeInfoManager*) (/home/shes050117/Work/mozilla-central/dom/base/nsINode.cpp:121)
 0:24.79 pid:26770 #05: NS_NewHTMLSharedElement(already_AddRefed<mozilla::dom::NodeInfo>&&, mozilla::dom::FromParser) (/home/shes050117/Work/mozilla-central/dom/html/HTMLSharedElement.cpp:25)
 0:24.79 pid:26770 #06: NS_NewHTMLHtmlElement(already_AddRefed<mozilla::dom::NodeInfo>&&, mozilla::dom::FromParser) (/home/shes050117/Work/mozilla-central/dom/html/nsGenericHTMLElement.h:1264)
 0:24.79 pid:26770 #07: nsContentDLF::CreateBlankDocument(nsILoadGroup*, nsIPrincipal*, nsIPrincipal*, nsDocShell*) (/home/shes050117/Work/mozilla-central/layout/build/nsContentDLF.cpp:256)
 0:24.80 pid:26770 #08: nsDocShell::CreateAboutBlankContentViewer(nsIPrincipal*, nsIPrincipal*, nsIContentSecurityPolicy*, nsIURI*, bool, bool, mozilla::dom::WindowGlobalChild*) (/home/shes050117/Work/mozilla-central/docshell/base/nsDocShell.cpp:6472)
 0:24.80 pid:26770 #09: nsDocShell::CreateAboutBlankContentViewer(nsIPrincipal*, nsIPrincipal*, nsIContentSecurityPolicy*) (/home/shes050117/Work/mozilla-central/docshell/base/nsDocShell.cpp:6527)
 0:24.80 pid:26770 #10: nsAppShellService::JustCreateTopWindow(nsIAppWindow*, nsIURI*, unsigned int, int, int, bool, mozilla::AppWindow**) (/home/shes050117/Work/mozilla-central/xpfe/appshell/nsAppShellService.cpp:727)
 0:24.80 pid:26770 #11: nsAppShellService::CreateTopLevelWindow(nsIAppWindow*, nsIURI*, unsigned int, int, int, nsIAppWindow**) (/home/shes050117/Work/mozilla-central/xpfe/appshell/nsAppShellService.cpp:172)
 0:24.80 pid:26770 #12: nsAppStartup::CreateChromeWindow(nsIWebBrowserChrome*, unsigned int, nsIOpenWindowInfo*, bool*, nsIWebBrowserChrome**) (/home/shes050117/Work/mozilla-central/toolkit/components/startup/nsAppStartup.cpp:632)
 0:24.80 pid:26770 #13: nsWindowWatcher::CreateChromeWindow(nsIWebBrowserChrome*, unsigned int, nsIOpenWindowInfo*, nsIWebBrowserChrome**) (/home/shes050117/Work/mozilla-central/toolkit/components/windowwatcher/nsWindowWatcher.cpp:418)
 0:24.81 pid:26770 #14: nsWindowWatcher::OpenWindowInternal(mozIDOMWindowProxy*, char const*, char const*, char const*, bool, bool, bool, nsIArray*, bool, bool, bool, nsDocShellLoadState*, mozilla::dom::BrowsingContext**) (/home/shes050117/Work/mozilla-central/toolkit/components/windowwatcher/nsWindowWatcher.cpp:940)
 0:24.81 pid:26770 #15: nsWindowWatcher::OpenWindow(mozIDOMWindowProxy*, char const*, char const*, char const*, nsISupports*, mozIDOMWindowProxy**) (/home/shes050117/Work/mozilla-central/toolkit/components/windowwatcher/nsWindowWatcher.cpp:293)
 0:24.81 pid:26770 #16: ??? (/home/shes050117/Work/mozilla-central/objdir/dist/bin/libxul.so + 0x888f78a)
 0:24.81 pid:26770 #17: CallMethodHelper::Invoke() (/home/shes050117/Work/mozilla-central/js/xpconnect/src/XPCWrappedNative.cpp:1632)
 0:24.81 pid:26770 #18: CallMethodHelper::Call() (/home/shes050117/Work/mozilla-central/js/xpconnect/src/XPCWrappedNative.cpp:1175)
 0:24.81 pid:26770 #19: XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) (/home/shes050117/Work/mozilla-central/js/xpconnect/src/XPCWrappedNative.cpp:1141)
 0:24.81 pid:26770 #20: XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) (/home/shes050117/Work/mozilla-central/js/xpconnect/src/XPCWrappedNativeJSOps.cpp:946)
 0:24.81 pid:26770 #21: CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:493)
 0:24.82 pid:26770 #22: js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:585)
 0:24.82 pid:26770 #23: InternalCall(JSContext*, js::AnyInvokeArgs const&, js::CallReason) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:648)
 0:24.82 pid:26770 #24: js::CallFromStack(JSContext*, JS::CallArgs const&) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:652)
 0:24.82 pid:26770 #25: Interpret(JSContext*, js::RunState&) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:3323)
 0:24.82 pid:26770 #26: js::RunScript(JSContext*, js::RunState&) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:465)
 0:24.82 pid:26770 #27: js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:620)
 0:24.83 pid:26770 #28: InternalCall(JSContext*, js::AnyInvokeArgs const&, js::CallReason) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:648)
 0:24.83 pid:26770 #29: js::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>, js::CallReason) (/home/shes050117/Work/mozilla-central/js/src/vm/Interpreter.cpp:665)

This case looks safe, though? Because it only loaded a blank document.

I think I figure out a solution. I have a patch which passes all the wpt tests for sharedarraybuffer, but it's a mess and I need to polish it.

So what I tried is basically:

  • Move CrossOriginIsolated function from nsGlobalWindowInner to BroswingContext
  • Cached the value for CrossOriginIsolated in the Document so that it would still have it to pass the sanity check for matching the key for the docgroup.

I still need to find out a better way to cache the CrossOriginIsolated in the Document. Now, there are at many places and I believe they can be reduced into one or two.

(In reply to Tom Tung [:tt, :ttung] from comment #21)

I think I figure out a solution. I have a patch which passes all the wpt tests for sharedarraybuffer, but it's a mess and I need to polish it.

Actually, it shouldn't pass all the wpt tests for SAB without P2 the other patches. I will take a deeper look tomorrow.
Here is the try for P1 with addressing comments: https://treeherder.mozilla.org/#/jobs?repo=try&revision=42fa7314fe0c2a90089a97ff701b93a7e2238ebc

Hi nika,

I have a few questions on D70347 and I list them below:

  • Check if it's crossOriginIsolated for DocGroup::GetKey(nsIPrincipal* aPrincipal, nsACString& aKey)

To do that we need to call nsGlobalWindowInner::CrossOriginIsolated in the Document. However, there are cases that the inner window hasn't been set to the document object and the inner window has already been set to nullptr when calling GetDocGroup.

A way that I can think of to fix this is to move CrossOriginIsolated to BrowsingContext. (So that we can know the value of whether is crossOriginIsolated before the inner window is set) And cache the value of CrossOriginIsolated in the Document object. (So that we don't need to worry if the docshell/browsingContext or innerWindow is detach from the document or not)

Does this make sense to you? Or, do you have any suggestions on this?

The check for crossOriginIsolated in the inner window.

  • Use SiteOrigin as a key for DocGroup when it's crossOriginIsolated.

We want to make postMessage only be allowed messaging SharedArrayBuffer in same-origin rather than allowing messaging it same-site. And I find out that using SiteOrigin as a key cannot achieve that. Would you mind elaborating more on why you suggested using SiteOrigin as a key?

Thanks in advance!

Flags: needinfo?(nika)

We discussed a strategy a bit over zoom today, so clearing my needinfo.

Flags: needinfo?(nika)
Severity: normal → N/A
Priority: P2 → P1
Attachment #9148047 - Attachment is obsolete: true
Attachment #9139444 - Attachment is obsolete: true
Attachment #9139445 - Attachment is obsolete: true
Depends on: 1575095

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b7150444b8cc363977ad9a6b34f73272558b8397&selectedTaskRun=DYIeD3TXTiGqYRlj0Apnuw-0

I will only land this patch once we fix bug 1575095. More clearly, only after we fix the issue that the hijacked response that has COOP+COEP set should run in the webCOOP+COEP processes.

This can be verified by not hitting the debug assertion for ensuring the agent cluster id while running html/cross-origin-opener-policy/popup-coop-by-sw-from-coop.https.html

(In reply to Tom Tung [:tt, :ttung] from comment #28)

try with patches in bug 1575095: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4530435be2c4ad29121f0b7b129664ee3828faad

Just to be clear, I ran these failing tests with my patches and the lastest m-c and they all passed on my desktop. So I assume they are all fixed by the newer vesion of patches in 1575095 and I am going to land my patches.

Pushed by ttung@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/df1a98ce0cc6
Move CrossOriginIsolated from nsGlobalWindowInner to BrowsingContext; r=nika
https://hg.mozilla.org/integration/autoland/rev/8a18e1a7a0bf
P1 - Using Origin as the key for DocGroup; r=farre,nika
https://hg.mozilla.org/integration/autoland/rev/6475c58bac8f
P2 - Changes on window-domain-success.https; r=annevk
https://hg.mozilla.org/integration/autoland/rev/904162ee5f5c
P3 - Changes on window-iframe-messagechannel.https.html; r=annevk
https://hg.mozilla.org/integration/autoland/rev/3edbca8b13bb
P4 - Changes on window-similar-but-cross-origin-success.https.sub.html; r=annevk
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/23821 for changes under testing/web-platform/tests
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.
Upstream PR merged by moz-wptsync-bot
You need to log in before you can comment on or make changes to this bug.