Open Bug 1370368 Opened 7 years ago Updated 2 years ago

port.onDisconnect not fired on page navigation

Categories

(WebExtensions :: General, defect, P3)

defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: eejdoowad, Assigned: robwu)

Details

(Whiteboard: triaged)

Attachments

(4 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36

Steps to reproduce:

Install this extension: https://github.com/lusakasa/port-disconnect-bug-firefox

Each content script establishes a port to the background page. The background page listens for the port.onDisconnect event.

Open one tab and navigate to different pages. Then close the tab.


Actual results:

port.onDisconnect never gets fired for navigation events. Only closing the tab triggers port.onDisconnect.


Expected results:

Every time the user navigates to a new page, a port.onDisconnect should be fired.
Summary: port.onDisconnect not fired on page navigaation → port.onDisconnect not fired on page navigation
Component: Untriaged → WebExtensions: General
Product: Firefox → Toolkit
Priority: -- → P3
Whiteboard: triaged
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Firefox has a concept of backforward cache - https://developer.mozilla.org/en-US/Firefox/Releases/1.5/Using_Firefox_1.5_caching

The bfcache stores the page state (and scripts) and allows it to be restored quickly when the user navigates back/forwards.

The port is not really in a truly disconnected state (it will be revived when the user navigates back to the page), nor in a fully connected state (a message sent to the port won't trigger onMessage).

Content scripts can use the pagehide/pageshow events to suspend/resume their logic.For instance, if I add the pagehide event to the test case that you've provided, and call port.postMessage from the content script, then the background page will print the message: https://pastebin.mozilla.org/9032633


This bug could be addressed by disconnecting the port upon pagehide.
Assignee: nobody → rob
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: DUPLICATE → ---
Thanks for the info. It's really informative.I had been listening for the beforeunlaod event as a workaround, but it prevents caching. The pagehide event seems like a better alternative.
Comment on attachment 8979549 [details]
Bug 1370368 - Disconnect port upon pagehide

https://reviewboard.mozilla.org/r/245694/#review251950

Looks good, just one question.

::: toolkit/components/extensions/ExtensionChild.jsm:285
(Diff revision 1)
> +    let onPageHide = (event) => {
> +      if (event.persisted) {
> +        // If the page is moved to the bfcache, then scripts in the page will be
> +        // suspended and the port becomes unreachable from the extension's
> +        // perspective. Explicitly disconnect the port when that happens.
> +        this.disconnect();

Does the content script side receive this disconnect event, or is it left in an undefined state if the page is reloaded?  This probably needs a test as well.

::: toolkit/components/extensions/test/xpcshell/test_ext_port_disconnect_upon_navigation.js:29
(Diff revision 1)
> +      });
> +    });
> +    browser.test.sendMessage("bg-ready");
> +  }
> +  function contentScript() {
> +    browser.test.sendMessage("content-script-ready");

This should probably be at the end.

::: toolkit/components/extensions/test/xpcshell/test_ext_port_disconnect_upon_navigation.js:34
(Diff revision 1)
> +    browser.test.sendMessage("content-script-ready");
> +    let port = browser.runtime.connect();
> +    port.onDisconnect.addListener(() => {
> +      // The background page has an onConnect listener and never disconnects
> +      // the port.  The port is disconnected when the page unloads, but that
> +      // shouldn't trigger the port.onDisconnect event.

Ok, there it is.  Can we actually try to send the disconnect event?  That seems better to me if we can.
Attachment #8979549 - Flags: review?(tomica)
Comment on attachment 8979549 [details]
Bug 1370368 - Disconnect port upon pagehide

https://reviewboard.mozilla.org/r/245694/#review251950

> Ok, there it is.  Can we actually try to send the disconnect event?  That seems better to me if we can.

That sounds reasonable: In some sense, the port is not disconnected by either end, so both ends should receive the onDisconnect event.

When?
- Around pagehide? No, because that will cause issues with extensions who want to keep a port open, and immediately reconnect on disconnect.
- At pageshow? Maybe. That would at least make the keep-port-alive-forever use case possible. I have implemented this in a separate commit (together with a test that checks that onDisconnect is not fired if disconnect() is called.

Should the `port.error` property be set to a special value when disconnected under these circumstances? For now I don't, but if you have a different opinion I'm all ears.
Canceling reviews here after an irc discussion with Rob and Kris (summary below).  Might want to discuss this at SF in a couple of weeks.

0) Without us doing anything, a background page trying to send a message to an "open" port would get an exception, even if it never received the onDisconnect event.  Content scripts can currently work around this by disconnecting ports manually onPageHide, and reconnecting onPageShow, but few are aware BFcache is even a thing, especially those ported from Chrome.

1) In short, because the effects of BFcache are observable from the outside, the "correct" solution here would be to prevent content pages with content scripts and open Ports from going to BFcache.  Since a bunch of popular extensions do this for every page, it would basically mean disabling BFcache for a large part of our users, and we don't want to do that.

2) In patches above, Rob and I propose an approach that would disconnect the port when the page goes into BFcache, but Kris noted that could still leave content scripts in an unexpected "broken" state if they didn't anticipate the port disconnecting (which is probably a common thing).

3) Other possible approaches could be queuing messages for those ports, or even a new event Port.onFrozen, but none of those are ideal.


If we can't find a solution we are happy with, we should at least throw a clear message when background page tries to send a message to a content script in BFcache, and document the whole situation clearly on MDN.
Attachment #8979549 - Flags: review?(tomica)
Attachment #8979930 - Flags: review?(tomica)
Attachment #8979931 - Flags: review?(tomica)
(In reply to Tomislav Jovanovic :zombie from comment #10)
> 0) Without us doing anything, a background page trying to send a message to
> an "open" port would get an exception, even if it never received the
> onDisconnect event.

In most cases, there is no error; and there is never an observable error.

Messages to recipients in the bfcache are silently dropped:
https://searchfox.org/mozilla-central/source/toolkit/components/extensions/ExtensionChild.jsm#257-259

In the rare case that a message was received right before transitioning to the bfcache, the following message is printed to the console: "Port.onMessage event fired while context is inactive."

https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/toolkit/components/extensions/ExtensionChild.jsm#194-200
https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/toolkit/components/extensions/ExtensionCommon.jsm#1885,1888-1889,1915-1917


> 1) In short, because the effects of BFcache are observable from the outside,
> the "correct" solution here would be to prevent content pages with content
> scripts and open Ports from going to BFcache.  Since a bunch of popular
> extensions do this for every page, it would basically mean disabling BFcache
> for a large part of our users, and we don't want to do that.

Indeed, we don't want this. Below I described a weaker version that does not break the bfcache for all pages, but I think that even the weaker version is still not wanted.


> 2) In patches above, Rob and I propose an approach that would disconnect the
> port when the page goes into BFcache, but Kris noted that could still leave
> content scripts in an unexpected "broken" state if they didn't anticipate
> the port disconnecting (which is probably a common thing).
>
> 3) Other possible approaches could be queuing messages for those ports, or
> even a new event Port.onFrozen, but none of those are ideal.

There shouldn't be a need for a new event, since extensions have all information they need. Extension developers just need to be aware of them (easier said than done).

Queuing is not ideal, because extensions have a reasonable expectation that a message will be delivered soonish; ping-pong should be fast.


The MessageChannel web API (https://developer.mozilla.org/en-US/docs/Web/API/MessageChannel) experiences a similar issue. It is handled as follows:
- The channel stays alive when a page goes into a bfcache,
- When the other end sends a message, the page evicted from the bfcache.

If we consider this to be a rare event, then we can implement the similar thing. The downside of this is that well-meaning extensions cannot 100% avoid such evictions, because there is a possible race:
- page port: pagehide event -> send message
- bg port: port.postMessage
- bg port.onMessage: receive pagehide - extension intends to behave well and stops sending messages.
- page port: the in-flight port.postMessage is received, and the page is evicted from the bfcache.

Therefore we should NEVER evict the page from the bfcache in response to port events/messages.


A better balance between compatibility and API usability is:
- unchanged: The extension port stays alive when a page enters the bfcache.
- new behavior: When port.postMessage is used, the port is disconnected with an appropriate error message.
- When the page is revived from the bfcache, the port.onDisconnect notification is fired too, with an appropriate error message.


To implement this implicit disconnect, we should account for other recipients of the port. If we don't, and automatically send an Extension:Port:Disconnect message when a context is not active, then extensions would not be able to open an extension port. Bug 1465514 is a way to fix this.
(an alternative implementation without depending on other bugs is to attach an expando to the received message and queue a microtask to send an Extension:Port:Disconnect if none of the Extension:Port:Connect/Message handlers have handled the message; this only works because currently our implementation synchronously processes the Extension:Port:Connect/Message messages).
Product: Toolkit → WebExtensions
The current situation when page with port ("pagePort") connected to the other end ("otherPort") moves into bfcache is:
- Extension doesn't know that the page has navigated (because otherPort.onDisconnect is not fired - per bug report).
- Calling otherPort.postMessage results in silently dropping messages.
- After navigating back, pagePort/otherPort can be used again.

Current patch:
- Extension is made aware of the navigation by otherPort.onDisconnect.
- Calling otherPort.postMessage throws error.
- After navigating back, pagePort.onDisconnect is triggered and pagePort/otherPort is unusable.

Proposal from end of comment 11:
- Extension doesn't know that the page has navigated.
- Calling otherPort.postMessage ultimately triggers the otherPort.onDisconnet event
- After navigating back:
  - If extension did not call otherPort.postMessage, then pagePort/otherPort can be used again.
  - If extension did call otherPort.postMessage, then pagePort/otherPort can be used.

Please let me know if you approve the revised proposal. Note that the proposal aligns with your comment 10:

> If we can't find a solution we are happy with, we should at least throw a clear message when background page tries to send a message to a content script in BFcache, and document the whole situation clearly on MDN.

Except I don't throw an error, but dispatch the onDisconnect event (to rule out the race condition where the recipient navigates after port.postMessage is called). Without bfcache, onDisconnect would also be called.
Flags: needinfo?(tomica)
With the revised scope in bug 12, here is a test case for the bug (the original test case from the OP doesn't necessarily work because it doesn't call postMessage from the background).

Reproduction steps of the problem
1. Load extension
2. Open example.com
3. Click on the "Click here to navigate elsewhere" button.
4. Look at the global JS console (Ctrl-Shift-J)

Expected (this happens in Chrome):
[0] Background: Received port
[0] Background: Got message: content page is about to pagehide
[0] Background port disconnected. port.error=undefined  runtime.lastError=(no lastError)
[0] Background: port.postMessage threw an error, as expected

Actual (Firefox 61):
[0] Background: Received port
[0] Background: Got message: content page is about to pagehide
[0] Called port.postMessage to unloaded page
(the last line should have been "Background port disconnected." and "Background: port.postMessage threw an error")


Firefox's expected result can also be seen by replacing step 3 with:
3. Close the tab.

Expected (Firefox 61, close tab instead of navigate):
[0] Background: Received port
[0] Background: Got message: content page is about to pagehide
[0] Background port disconnected. port.error=null  runtime.lastError=(no lastError)
[0] Background: port.postMessage threw an error, as expected
(In reply to Rob Wu [:robwu] from comment #12)
>   - If extension did call otherPort.postMessage, then pagePort/otherPort can be used.

Provided that's "cannot be used", I like this -- a pragmatic approach that seems least bad.
Flags: needinfo?(tomica)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: