Closed Bug 1768522 Opened 2 years ago Closed 2 years ago

Tab Extension pages loaded receive API events and can trigger API method calls after the user navigated the tab to a page running in a different process and Fission is enabled

Categories

(WebExtensions :: General, defect, P2)

defect

Tracking

(firefox102 fixed)

RESOLVED FIXED
102 Branch
Tracking Status
firefox102 --- fixed

People

(Reporter: rpl, Assigned: rpl)

References

Details

(Whiteboard: [addons-jira])

Attachments

(1 file, 1 obsolete file)

The changes applied in Bug 1761828 did make it visible that with Fission enabled

  • the extension pages loaded in a tab and navigated by the user to a page that is expected to be loaded in a different process (e.g. if a webpage is loaded in a tab where an extension newtab page was previously loaded, as in the STR described in Bug 1761828 comment 10) are still notified about the WebExtensions API events they have registered a listener for while their are in the BFCache.

The changes introduced by Bug 1761828 made that visible because ProxyAPIImplementation.prototype.callAsyncFunction is currently accessing context.contentWindow?.windowUtils?.isHandlingUserInput for any async API methods implemented in the parent process (whereas before that change only the smaller subset of methods that require user input where going to do the same)

Accessing context.contentWindow?.windowUtils?.isHandlingUserInput is throwing NS_ERROR_XPC_SECURITY_MANAGER_VETO (see https://bugzilla.mozilla.org/show_bug.cgi?id=1761828#c4) because the page is in the BFCache (confirmed locally using the STR and logging the value set on browserContext.isActive and browserContext.isInBFCache, which shows that the browsing context is considered active but is in the bfcache when the NS_ERROR_XPC_SECURITY_MANAGER_VETO error is being raised).

This patch is currently making sure that context.active isn't going to be true when an extension page
has been moved into the bfcache because the browser element where it was loading into has been navigated
to a page that needs to run in a different process.

The current set of changes would only preventing our WebExtensions API internals from emitting WebExtensions
API events (and also to resolve WebExtensions API async API methods calls that may have been resolved in the
meantime) but technically:

  • it is not yet matching what was actually happening with Fission disabled
  • there may be other things that the bfcached extension page shouldnt' be able to do that may still be
    possible and may be surprising (because they may not match what both the extension and extension developers
    may expect, as well as what our internals currently expects).

In particular, I've confirmed locally (using the STR form the bug mentioned in comment 0) that with
Fission disabled the ExtensionPageContextChild instance related to the extension page in the
tab navigated to a webpage is actually being destroyed (in response to "inner-window-destroyed" being
notified for that inner window id) but that wouldn't still be the case with Fission enabled and
the changes in the current version of this patch.

And so I'm attaching this patch just as a preliminary exploratory WIP patch, while I'll be working on
writing some new test cases and look into the additional changes to make the behavior under fission
to match the previous expected behaviors (and evalute if that would be actually still an appropriate
behavior under fission, what other options we may have).

This patch contains a reduced mochitest-browser testcase which recreates the behaviors observed with the
extension mentioned in comment 0.

The test is:

  • installing a small test extension (mostly empty, no extension API are being used in the extension
    page included in the extension and set as a newtab override).
  • opening a new tab and then retrieving the extension contextId for the newtab extension page loaded
  • navigating the newtab to a webpage running in a web content process
  • opening a second newtab to the extension page and looks for the context with the contextId
    previously got from the first newtab extension page
  • asserting that the extension context related to the first newtab should not exist anymore

Running the test using mach mochitest --disable-fission does pass (because the context has been
destroyed when the first newtab navigated to the webpage), while running it in fission mode fails
(because the context does still exist).

The purpose of this test in its current form is mainly making it easier to investigate the
reasons behind the diferent behavior.

Using the reduced test case attached in comment 2 I have looked into which MOZ_LOG could be useful to get a better picture of which internals are involved in the decision (in both the parent and child processes) to allow the extension page to move into the BFCache and confirm if the current behavior is actually working as expected and our internals should be adjusted (similarly to what the patch attached in comment 1) to account for that.

The following MOZ_LOG ids have been quite useful to follow what was going on in the parent and child process: "SHIPBFCache" has been especially useful to follow the Session In Parent pieces (along with it during the investigation I also found helpful the "BrowsingContext" and "BrowsingContextSync" ones).

Determine where the decision is happening

e.g. the following logs are emitted by using MOZ_LOG="SHIPBFCache:5,sync" while executing the reduced test case:

[rr 724670 171333][Parent 724670: Main Thread]: D/SHIPBFCache Checking moz-extension://f626455a-4631-4f4d-89e3-79a44482a751/newtab.html
[rr 724670 171337][Parent 724670: Main Thread]: D/SHIPBFCache  +> moz-extension://f626455a-4631-4f4d-89e3-79a44482a751/newtab.html shouldn't be blocked from going into the BFCache
[rr 724670 171341][Parent 724670: Main Thread]: V/SHIPBFCache Adding request about:document-onload-blocker to loadgroup for chrome://browser/content/browser.xhtml
[rr 724670 171345][Parent 724670: Main Thread]: V/SHIPBFCache Loadgroup for chrome://browser/content/browser.xhtml has multiple requests relevant for blocking BFCache
[rr 724670 171349][Parent 724670: Main Thread]: D/SHIPBFCache nsFrameLoaderOwner::ChangeRemotenessCommon: store the old page in bfcache
[rr 724670 171353][Parent 724670: Main Thread]: D/SHIPBFCache nsSHistory::EvictOutOfRangeContentViewers 0
...
[rr 724819 171541][Child 724819: Main Thread]: D/SHIPBFCache Document::OnPageHide moz-extension://f626455a-4631-4f4d-89e3-79a44482a751/newtab.html persisted=1
...
[rr 724670 171852][Parent 724670: Main Thread]: D/SHIPBFCache Setting BFCache flags for moz-extension://f626455a-4631-4f4d-89e3-79a44482a751/newtab.html +(SUSPENDED) -(0)
[rr 724819 171876][Child 724819: Main Thread]: V/SHIPBFCache Removing request documentchannel:http://example.com/ from loadgroup for moz-extension://f626455a-4631-4f4d-89e3-79a44482a751/newtab.html
[rr 724819 171880][Child 724819: Main Thread]: V/SHIPBFCache Loadgroup for moz-extension://f626455a-4631-4f4d-89e3-79a44482a751/newtab.html doesn't have any requests relevant for blocking BFCache

Based on the logs above:

  • CanonicalBrowsingContext::AllowedInBFCache is where we determine if a certain browsing context is allowed to go into the BFCache, as it can be seen by looking to the method on searchfox we don't special case moz-extension urls and so we are treating the top level extension page as we would be doing for a regular webpage (which actually sounds pretty reasonable)
  • nsFrameLoaderOwner::ChangeRemotenessCommon then calls bc->SetIsInBFCache(true); in the parent process (which is then synced as expected in the child process, as I did notice when I looked to the logs emitted using "BrowsingContextSync" MOZ_LOG before)

Confirm that the extension page is also being evicted from the bfcache as expected

Another detail that I was interested into confirming explicitly was that the extension page entering into the BFCache was then going to be evicted as expected and it wouldn't stay around forever.

I confirmed that manually by using a small extension similar to the one in the reduced test case.

Once the history grows a bit more by navigating the tab on some more pages (after the extension page got into the BFCache), the eviction is triggered as the logs emitted using MOZ_LOG="SHIPBFCache:5,sync" also confirms (I added a "***** DEBUG .... *****" dump to ExtensionPageContextChild to more easily pinpoint where that was happening, because in this case the logs emitted using MOZ_LOG were not going to show me an url that would have made that immediately visible):

[Parent 740038: Main Thread]: D/SHIPBFCache nsSHistory::EvictOutOfRangeContentViewers 4
[Parent 740038: Main Thread]: D/SHIPBFCache nsSHistory::EvictContentViewerForEntry destroying an nsFrameLoader.
***** DEBUG destroying extension context: contextId 274877907146 tabId 4 *****

Based on these logs, the extension page is being evicted as expected under SHIP (SessionHistory In Parent) through SessionHistoryEntry::SetFrameLoader => nsSHistory::EvictOutOfRangeContentViewers.

Based on what observed (and described in comment 3) it seems to me that:

  • any DOM API should be already handling the BFCache appropriately (either blocking the page entering the BFCache when appropriate or making sure it doesn't keep doing work while the page is in the BFCache, like playing audio, and if in some cases we are not handing some case as appropriate it would not be a bug specific to "extension page")

  • the WebExtension APIs internals should take into account that an extension page may end up into the BFCache (and preventing an extension page from ever entering the BFCache doesn't seem the right way)

I think that it would still be appropriate to confirm what described in comment 3 and the statements above with an engineer that have a more direct knowledge about Fission and SHIP.

In the meantime I'll look into turning the reduced test case into a more appropriate test case, fold it into the other patch and then consider expanding it to cover some other behaviors that may be worth covering more explicitly.

the WebExtension APIs internals should take into account that an extension page may end up into the BFCache (and preventing an extension page from ever entering the BFCache doesn't seem the right way)

The framework already attempts to account for the bfcache:

The stack trace at https://bugzilla.mozilla.org/show_bug.cgi?id=1761828#c4 suggests that it is possible for these to be called despite context.active being true. That in turn suggests that there may be an issue with the implementation of InnerWindowReference (with SHIP enabled?) at https://searchfox.org/mozilla-central/rev/88792eff309001778cb2431f2a0ed92f8f3c258a/toolkit/components/extensions/ExtensionCommon.jsm#433-502

(In reply to Rob Wu [:robwu] from comment #5)

the WebExtension APIs internals should take into account that an extension page may end up into the BFCache (and preventing an extension page from ever entering the BFCache doesn't seem the right way)

I'm definitely already aware of that, in fact the patch attached in comment 1 does currently include an explicit additional check for the isBFCache flag from inside the InnerWindowReference class, but before moving forward with an actual set of proposed changes to InnerWindowReference class (and/or the other components involved that are using it) I needed to get a better picture about where and how the decision was being made.

Blocks: 1499129
Attachment #9275766 - Attachment description: WIP: Bug 1768522 - ExtensionBaseContextChild instances should not be active after navigating it to a page running in another process under fission. → Bug 1768522 - ExtensionBaseContextChild instances should not be active after navigating it to a page running in another process under fission.
Attachment #9275941 - Attachment is obsolete: true

I've marked the separate phabricator revision that was including only the reduced test case as obsolete, the test case in that form has already served its purpose and now it has been:

  • reworked to assert the expected behaviors under fission and non-fission jobs
  • expanded to cover both same-process and cross-process navigations and more details related to the expected behaviors (included explicitly covering Bug 1499129)
  • folded into the other revision (D145919).

Bug 1499129 and the new version of the test case confirms that the WebExtensions internals should account for the extension pages to be moved into the bfcache and that it would be reasonable that the behaviors of same-process navigation under non-fission and the behaviors of a cross-process navigation under fission (and SessionHistory in Parent) to match each other.

And so I'm also marking D145919 as up for review.

Severity: -- → S2
Status: NEW → ASSIGNED
Priority: -- → P2
Pushed by luca.greco@alcacoop.it:
https://hg.mozilla.org/integration/autoland/rev/606c0a559944
ExtensionBaseContextChild instances should not be active after navigating it to a page running in another process under fission. r=robwu
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 102 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: