Fix loads that inherit principals - about:blank and about:srcdoc - and make them go via DocumentChannel
Categories
(Core :: DOM: Navigation, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox82 | --- | fixed |
People
(Reporter: zombie, Assigned: annyG)
References
Details
Attachments
(17 files, 2 obsolete files)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review |
In a mochitest, with a debug build and --enable-fission
:
let win = window.open();
win.location = "http://example.net/browser";
setTimeout(() => win.location = "about:blank", 300); // crash
crashes the content process with
Assertion failure: PermissionAvailable(prin, aType), at z:/build/build/src/extensions/permissions/nsPermissionManager.cpp:2376
Reporter | ||
Updated•5 years ago
|
Reporter | ||
Comment 1•5 years ago
|
||
Updated•5 years ago
|
Comment 2•5 years ago
|
||
zombie, is there a testcase for this? I see you marked the attachment as obsolete, so I assume it isn't that.
Updated•5 years ago
|
Reporter | ||
Comment 3•5 years ago
|
||
https://phabricator.services.mozilla.com/D49427
That's still the test case. I didn't know how to proceed working on this, so I abandoned the revision to get it out of my phabricator dashboard, and apparently that marked it as obsolete here.
Reporter | ||
Updated•5 years ago
|
Comment 4•5 years ago
|
||
I tried and failed to reproduce.
Does one need to do anything special here to reproduce?
Reporter | ||
Comment 5•5 years ago
|
||
Nothing special, just an (artifact) debug build on Windows. I've updated to the latest m-c, applied the patch, ran
mach test test_fission_navigation_assert.html --enable-fission
and it still reproduces. Note however that, at least for me locally, the crash is not immediately obvious from the test summary:
0 INFO TEST-START | Shutdown
1 INFO Passed: 0
2 INFO Failed: 0
I need to grep the log spam to actually find the crash:
A content process crashed and MOZ_CRASHREPORTER_SHUTDOWN is set, shutting down
Reporter | ||
Comment 6•5 years ago
|
||
And here's the crash on a linux debug try run:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=274244170&repo=try&lineNumber=3989-4022
Comment 7•5 years ago
|
||
I tried to reproduce this again, but no luck on Linux/debug.
Updated•5 years ago
|
Comment 8•5 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #7)
I tried to reproduce this again, but no luck on Linux/debug.
I'm not sure if it's the same issue here, but I hit the same assertion running fresh profile when setting privacy.firstparty.isolate = true
in ~/.mozbuild/machrc
.
Also reproducible applying D49427 without touching any prefs.
Comment 9•5 years ago
|
||
(In reply to Gary Chen [:xeonchen] from comment #8)
I'm not sure if it's the same issue here, but I hit the same assertion running fresh profile when setting
privacy.firstparty.isolate = true
in~/.mozbuild/machrc
.
I assume that is without Fission being enabled?
Comment 10•5 years ago
•
|
||
(In reply to Olli Pettay [:smaug] from comment #9)
(In reply to Gary Chen [:xeonchen] from comment #8)
I'm not sure if it's the same issue here, but I hit the same assertion running fresh profile when setting
privacy.firstparty.isolate = true
in~/.mozbuild/machrc
.I assume that is without Fission being enabled?
Correct, fission is disabled by default.
Comment 11•5 years ago
|
||
(In reply to Gary Chen [:xeonchen] from comment #8)
I hit the same assertion running fresh profile when setting
privacy.firstparty.isolate = true
in~/.mozbuild/machrc
.
Nika asked me to needinfo her to determine whether this bug is Fission-specific or not.
Debug assertion failures don't block Fission Dogfooding (M5), but this assertion gets in the way of engineers trying to run tests locally with Fission.
Comment 12•5 years ago
|
||
Nika says this is a bug in about:blank process switching and remote principals. The code assumes that loading about:blank does not process switch, but it currently does. Nika will talk with Matt about using DocumentChannel to load about:blank.
What is the expected principal of about:blank opened in a new popup? Null principal, system principal, or inherit the popup creator's principal? What does the spec say?
Deferring to Fission Nightly (M6) because the fix is not clear and could be a lot of work.
Comment 13•5 years ago
|
||
The root of the issue here is that loading about:blank
inside of a BrowsingContext
loads a document which inherits the principal from the global which triggered the load. This causes an issue, as we don't run about:blank
loads through DocumentChannel
, which means that we don't trigger a process switch to the correct process, and end up loading an about:blank
with a principal which doesn't match the target process.
We need to support performing a process switch in this case so that the final loaded about:blank
document is loaded in the current process.
Comment 14•5 years ago
|
||
Nika says we want all loads to go through DocumentChannel. This code doesn't go through DocumentChannel.
Comment 15•4 years ago
|
||
Anny, this about:blank bug is the same issue as srcdoc loads, so this would be a good bug for you to work on after srcdoc.
about:blank should use DocumentChannel (in some cases).
Tracking for Fission Nightly M6b
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 16•4 years ago
|
||
I am changing how about:blank loads (making it go via DocumentChannel mechanism) which means the load for the blank page is now taking longer and is not instantaneous. This seems to lead to some races in the devtools code. E.g. here https://treeherder.mozilla.org/#/jobs?repo=try&selectedTaskRun=Fu-VfGsTRU25Z7cfPUnzIg.0&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunning%2Cpending%2Crunnable&revision=4db9b9f2c2a709afd15277078ff5463f60e03e30 we have a leftover devtools:toolbox window, even though https://searchfox.org/mozilla-central/rev/d6d8fcc22c3820f2ae08229e0d37be19fba74db9/devtools/client/framework/test/browser_toolbox_toggle.js#103-106 is supposed to wait for the toolbox to be destroyed.
Tracing through the code shows that by the time “toolbox-destroyed” is emitted, it is not necessarily that the devtools toolbox has been closed, as we don’t really wait for the window to close. It seems that previously about:blank load (https://searchfox.org/mozilla-central/rev/d6d8fcc22c3820f2ae08229e0d37be19fba74db9/devtools/client/framework/toolbox.js#3778) was happening much faster and triggered the unload event earlier and by the time “toolbox-destroyed” was emitted, the window happened to be closed.
I thought I could fix it with these changes https://hg.mozilla.org/try/diff/201ce0b7999a465c43682dd7b35bd75e4f905744/devtools/client/framework/devtools.js but while that works for the test above, it creates many more other failures https://treeherder.mozilla.org/#/jobs?repo=try&selectedTaskRun=Uv3TPcDDSBOGqlVIsifhpA.0&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunning%2Cpending%2Crunnable&author=agakhokidze%40mozilla.com
I’m wondering if you have some other suggestions for how I can fix toolbox closing or how I can improve this solution?
Comment 17•4 years ago
|
||
Keeping the need info, but I am looking into this.
The toolbox-host-manager uses the unload
event from the toolbox window to trigger the host destruction.
Could we explicitly ask the host manager to destroy the host from toolbox's destroy instead? Or do we have codepaths which trigger an unload
without going through toolbox destroy (would be surprising)?
From the comments around the about:
blank navigation, it seems we rely on this to release the document.
I don't know why closing the host is not enough though. (:ochameau might have more context but he is currently on PTO)
If the navigation is really necessary, it seems like a workaround, and maybe we could use something else?
So for now, I will try to emit an event from the toolbox destroy and listen to it in the toolbox-host-manager in order to replace the "unload" event mechanism. Let's see what try says: https://treeherder.mozilla.org/#/jobs?repo=try&resultStatus=testfailed%2Cbusted%2Cexception%2Csuccess%2Cusercancel%2Crunning%2Cpending%2Crunnable&revision=ac9699f18bbedc852525789314ce663ef7900000
From other try pushes I did a bit earlier I think the only failure left is devtools/client/webconsole/test/browser/browser_webconsole_multiple_windows_and_tabs.js | Uncaught exception - 2147942487
but I am not sure this is linked to the same issue.
I thought I could fix it with these changes
https://hg.mozilla.org/try/diff/201ce0b7999a465c43682dd7b35bd75e4f905744/devtools/client/framework/devtools.js
but while that works for the test above, it creates many more other failures
https://treeherder.mozilla.org/#/jobs?repo=try&selectedTaskRun=Uv3TPcDDSBOGqlVIsifhpA.0&
If we add this, we should probably remove the unload handler from the toolbox-host-manager, otherwise we have two potential call sites that can trigger a host destroy, and if both kick in only the first one will run successfully. The second one will fail with can't access property "frame", this.host is null
(which I see in a few failed tests in this try run).
Also the additional await here is always worrying, because the toolbox destroy is very fragile and susceptible to races. But overall I think that's the good direction. It's similar to what I am currently doing, but I am using a separate event, fired exactly when we used to navigate to about:blank, in case it matters (as I said, toolbox destroy timing can be very finicky) .
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 19•4 years ago
|
||
Moved the DevTools specific discussion to Bug 1654745.
I pinged Daisuke on the patch attached to this other bug, so clearing the ni? here.
On my try push the only remaining devtools failure seems to be
devtools/client/webconsole/test/browser/browser_webconsole_multiple_windows_and_tabs.js | Uncaught exception - 2147942487
Any idea if this could be related to your change or if this is a devtools issue?
Comment 20•4 years ago
|
||
I don't know what the specific cause is, but figured I'd add that that number corresponds to NS_ERROR_ILLEGAL_VALUE
(or NS_ERROR_INVALID_ARG
, NS_ERROR_INVALID_POINTER
, NS_ERROR_NULL_POINTER
, NS_ERROR_XPATH_INVALID_ARG
- we use that number for a bunch of errors apparently).
Assignee | ||
Comment 21•4 years ago
|
||
I wrote up an explanation of where the error was coming from and then solved it but I'll leave the explanation here!
In the above test we have the following code const tab3 = await addTab(TEST_URI, { window: win2 }); where we call addTab
from devtools/client/shared/test/shared-head.js
, which calls BrowserTestUtils.addTab
where we call tabbrowser.addTab
in the last line. There, we have the following call SessionStore.setLastClosedTabCount(window, 1);
and in SessionStoreInternal.setLastClosedTabCount
we discover that the window does not have __SSi
property so we throw NS_ERROR_INVALID_ARG
.
Then I figured we are probably not waiting long enough for the newly opened window to load here before we try to add a new tab to it, so I modified the test to use BrowserTestUtils.waitForNewWindow()
to wait for the new window to open instead of the "load" event and it worked (at least locally!)
Assignee | ||
Comment 22•4 years ago
|
||
This patch enables sandboxed srcdoc loads to take place via DocumentChannel,
and adds mechanisms for enabling unsandboxed ones.
Both unsandboxed srcdoc, and in subsequent patches, about:blank, loads require
that the triggering principal and the principal to inherit point to the same
instance if the load takes place in the same process as where we are inheriting
those principals from. We save those principals on a target browsing context before
we load the URI, and later, when we are deserializing LoadInfoArgs into
LoadInfo in the content process, we retrieve the saved principals if the
current load identifier of the target BC matches the load identifier saved
along with the principals.
We also need to make sure that during a process switch for about:srcdoc load,
we don't use the original URI for about:srcdoc to determine the remote type and
instead we use channel's result principal.
Assignee | ||
Comment 23•4 years ago
|
||
Depends on D85079
Assignee | ||
Comment 24•4 years ago
|
||
In process selection logic, ensure that we don't use the original URI for
about:blank and instead use the result principal. If the about:blank load has a
null principal, then revert to using the original URI.
Also, remove an extra about:blank load when an nsFrameLoaderOwner is changing
remoteness to prevent races.
Depends on D85080
Assignee | ||
Comment 25•4 years ago
|
||
Depends on D85081
Assignee | ||
Comment 26•4 years ago
|
||
Depends on D85082
Assignee | ||
Comment 27•4 years ago
|
||
Creating more than 2 nested iframes is not allowed and is now enforced for
about:blank loads because they now take place via DocumentChannel.
Depends on D85083
Assignee | ||
Comment 28•4 years ago
|
||
about:blank loads now take place via DocumentChannel and no longer happen
instantenously.
Depends on D85084
Assignee | ||
Comment 29•4 years ago
|
||
Depends on D85085
Updated•4 years ago
|
Assignee | ||
Comment 30•4 years ago
|
||
Depends on D49427
Comment 31•4 years ago
|
||
Comment 32•4 years ago
|
||
Backed out for failure at browser_saveHeapSnapshot_e10s_01.js
Backout link: https://hg.mozilla.org/integration/autoland/rev/bd2b58673114519b572212909ff8af8ed9bc8649
Failure log 1: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=312169375&repo=autoland&lineNumber=15857
Failure log 2: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=312169356&repo=autoland&lineNumber=4799
Failure log 3: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=312170185&repo=autoland&lineNumber=3893
Assignee | ||
Comment 33•4 years ago
•
|
||
After spending too long trying to find out why test testing/web-platform/tests/html/browsers/browsing-the-web/navigating-across-documents/010.html has been timing out (and with the help of my teammates), I figured I'd type up an explanation here, for posterity.
Roughly, the test is supposed to do the following:
- click on the
<a>
element (targetting iframetest
), which has anonclick
handler that will submit a form causing (a) to execute, and anhref
attribute, set to loadhref.html
, where the JS code will do (b). - (a) Note that form's action is
javascript:
URI and the form is targetting iframetest
. The JS code opens an XHR request forblank.html?pipe=trickle(d2)
, and is expecting a delay of 2 seconds from the server before the response is sent back. If the XHR completes before the test has ended (fail case scenario), post a message to the parent with the valuewrite
and do other things. - (b)
href.html
will post a message to the parent with value "href"
The expected scenario is that the XHR will be so slow that href.html
will have a chance to be loaded in the iframe before the XHR completes, and thus the message that will be posted to the parent will be sent by script code in href.html
.
However, now that non-initial (explicit) about:blank loads are taking place via DocumentChannel, it might take longer for about:blank to load. This results in the following scenario as explained to me by Kris:
- When the code inside of the form's action URI starts executing, it is being executed inside of the document that has the initial
about:blank
loaded. Shortly before that, we started loading the explicitabout:blank
page (happens whenever we have an iframe without a<src>
element). The XHR starts, suppress event handling and ends up spinning an event loop. - Then we start loading
href.html
which causes the explicit about:blank load to be cancelled. - We end up reusing about:blank's DocumentViewer for
href.html
load, and because the XHR is associated with the initial about:blank, we don't end up removing it and so it keeps spinning the event loop. If the no-srcabout:blank
load happened faster, the XHR request would have loaded inside of it and when the load forhref.html
began, we wouldn't have reused the document (because it's not an initial one) thus destroying the document and XHR with it. Removing the XHR would cause the event loop to unblock and to unsupress event handling.
The solution is to make sure we wait for onload event before we execute test steps (i.e. clicking on <a>
).
Assignee | ||
Comment 34•4 years ago
|
||
Assignee | ||
Comment 35•4 years ago
|
||
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 36•4 years ago
|
||
Depends on D87328
Assignee | ||
Comment 37•4 years ago
|
||
Depends on D87932
Assignee | ||
Comment 38•4 years ago
|
||
The race occurs when the parent changes the owner process for a BC, but the
child does not know about it and proceeds to call SetCurrentInnerWindowId on a
BC it no longer owns. To fix this, in child process, whenever we call
SetCurrentInnerWindowId on a BC, check that the BC is in process and that the
docshell has not been notified about an upcoming process change.
Depends on D87933
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 39•4 years ago
|
||
Depends on D87934
Assignee | ||
Comment 40•4 years ago
|
||
Depends on D88228
Assignee | ||
Comment 41•4 years ago
|
||
Assignee | ||
Comment 42•4 years ago
|
||
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 43•4 years ago
|
||
Comment 45•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6064f8676546
https://hg.mozilla.org/mozilla-central/rev/cc5151920b08
https://hg.mozilla.org/mozilla-central/rev/20664562b914
https://hg.mozilla.org/mozilla-central/rev/e3be5dbce78d
https://hg.mozilla.org/mozilla-central/rev/baebe0172124
https://hg.mozilla.org/mozilla-central/rev/60305abadb0b
https://hg.mozilla.org/mozilla-central/rev/79d25a5c4998
https://hg.mozilla.org/mozilla-central/rev/a8992824e237
https://hg.mozilla.org/mozilla-central/rev/25825439eae1
https://hg.mozilla.org/mozilla-central/rev/32418f54f50d
https://hg.mozilla.org/mozilla-central/rev/a8edf2857458
https://hg.mozilla.org/mozilla-central/rev/4f885480d348
https://hg.mozilla.org/mozilla-central/rev/ff23c8d322ef
https://hg.mozilla.org/mozilla-central/rev/7f1db08be307
https://hg.mozilla.org/mozilla-central/rev/a617c624c24f
https://hg.mozilla.org/mozilla-central/rev/cbf3b7ea7990
https://hg.mozilla.org/mozilla-central/rev/e763b6e09a49
Description
•