Closed Bug 1369841 Opened 3 years ago Closed 2 years ago

Handle runtime.onMessage/onConnect in the background page during browser startup

Categories

(WebExtensions :: General, enhancement, P2)

enhancement

Tracking

(firefox62 fixed)

RESOLVED FIXED
mozilla62
Tracking Status
firefox62 --- fixed

People

(Reporter: ianbicking, Assigned: aswan)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: triaged)

Attachments

(2 files)

We got this bug report in Screenshots: https://github.com/mozilla-services/screenshots/issues/2955

It's not that easy to reproduce, but I'm guessing it's due to a content script being loaded on a page before the background page is initialized. Since deferred startup for background pages just landed, I'm guessing that's related.

I believe extension authors will expect the background page to be loaded before any content_scripts (or implicitly expect it as we did), and so content_scripts should also be deferred.
Priority: -- → P2
Whiteboard: triaged
We do wait until the background page is loaded before we begin loading content scripts, in general. It's possible that they may load earlier in some corner cases, though, like when new content processes start in the middle of the startup cycle.
I was about to file the same bug. :)

Content script has chrome.runtime.sendMessage()
Background script has chrome.runtime.onMessage.addListener()

On loading the extension I get the following error:

> Error: Could not establish connection. Receiving end does not exist.  (unknown)

I added debugging and noticed that content scripts are injected into pages before the background scripts is run which seems like the wrong order practically and semantically. 

The "content_scripts" is set to "run_at": "document_start", if that makes any difference.
Duplicate of this bug: 1388270
(In reply to Ian Bicking (:ianb) from comment #0)

> It's not that easy to reproduce

This happens consistently on Android. Every time you install the extension, it runs the content scripts first and initializes the background page afterwards.
Yes, i have also learned this some months ago. BG is not ready and content script is running. Addon enable/disable reproduce this 100%.
Flags: needinfo?(amckay)
This is gong to be fixed over in bug 1378459, so we'll duplicate over to that.
Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(amckay)
Resolution: --- → DUPLICATE
Duplicate of bug: webextensions-startup
These are related to be sure, but not the same thing.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Duplicate of this bug: 1422268
It hasn't been clearly stated but the issue here is content scripts trying to use runtime.sendMessage() before the background page has had a chance to add a runtime.onMessage listener.  This is not just an issue for content scripts, bug 1460607 for instance is about this happening from an extension page in a sidebar.

I think we can address this during browser startup by making runtime.onMessage and runtime.onConnect into persistent events, which will ensure that we don't drop any messages sent to the background page before if has started during browser startup.  This explicitly will not handle other styles of messaging passing (e.g., from a background page to a content script, between non-background extension pages, etc.) but extension authors should not be making any assumptions about the order in which those run anyway.

Also, we may still have an issue when an extension is enabled at runtime (including when an extension is first installed).  I don't intend to try to cover that case here.
Assignee: nobody → aswan
Summary: content_scripts can run before background page is loaded → Handle runtime.onMessage() in the background page during browser startup
Summary: Handle runtime.onMessage() in the background page during browser startup → Handle runtime.onMessage/onConnect in the background page during browser startup
Blocks: 1460607
Attachment #8981739 - Flags: review?(tomica)
Comment on attachment 8981739 [details]
Bug 1369841 Make browser.runtime.onMessage and onConnect persistent

https://reviewboard.mozilla.org/r/247828/#review254242

::: toolkit/components/extensions/Extension.jsm:1453
(Diff revision 1)
>  
>    saveStartupData() {
>      if (this.dontSaveStartupData) {
>        return;
>      }
> -    AddonManagerPrivate.setStartupData(this.id, this.startupData);
> +    XPIProvider.setStartupData(this.id, this.startupData);

I'm guessing unrelated cleanup?

::: toolkit/components/extensions/parent/ext-runtime.js:202
(Diff revision 1)
> +    extension.wakeupBackground = () => {
> +      let promise = fire.wakeup();
> +      promise.then(() => { extension.wakeupBackground = undefined; });
> +      extension.wakeupBackground = () => promise;
> +      return promise;
> +    };

Isn't this getting called multiple times per extension?   Which is not necessarily a big deal, but probably means it could be simplified (the whole setup looks a bit convoluted/fragile).

::: toolkit/components/extensions/parent/ext-toolkit.js:86
(Diff revision 1)
>  
>  // browserPaintedPromise and browserStartupPromise are promises that
>  // resolve after the first browser window is painted and after browser
>  // windows have been restored, respectively.
> +// _resetStartupPromises() is accessible here for use from tests.
> +global._resetStartupPromises = () => {

ugh, why didn't the original delayed stuff need this for testing?

::: toolkit/components/extensions/test/xpcshell/test_ext_messaging_startup.js:63
(Diff revision 1)
> +  info(`Test wakeup for ${what} from an extension page`);
> +  await promiseRestartManager();
> +  await extension.awaitStartup();
> +
> +  let events = trackEvents(extension);
> +

How about asserting events were not fired here?  Probably after an `executeSoon` for good measure.

::: toolkit/components/extensions/test/xpcshell/test_ext_messaging_startup.js:91
(Diff revision 1)
> +  apiManager.global._resetStartupPromises();
> +  await promiseRestartManager();
> +  await extension.awaitStartup();
> +
> +  events = trackEvents(extension);
> +

Same here.

::: toolkit/components/extensions/test/xpcshell/test_ext_messaging_startup.js:113
(Diff revision 1)
> +  ok(true, "Background page loaded and received message from content script");
> +
> +  await page.close();
> +  await extension.unload();
> +
> +  apiManager.global._resetStartupPromises();

I think it's better to do this at the start of the test that depends on it, and not expect the world to be in a convenient state before the test.
Attachment #8981739 - Flags: review?(tomica) → review+
Comment on attachment 8981739 [details]
Bug 1369841 Make browser.runtime.onMessage and onConnect persistent

https://reviewboard.mozilla.org/r/247828/#review254242

> I'm guessing unrelated cleanup?

Not exactly, with Kris's recent async shutdown changes I started getting into situations (probably test only) where an extension called into the AddonManager setStartupData() wrapper after shutdown had begun so getProviderByName("XPIProvider") was returning null.  When setStartupData() was originally implemented, we had a (at this point rather arbitrary) abstraction barrier where things outside of toolkit/mozapps/ didn't reach into XPIProvider directly.  That barrier got breached in the mean time so this fix was much more expedient than screwing around with shutdown timing.

> Isn't this getting called multiple times per extension?   Which is not necessarily a big deal, but probably means it could be simplified (the whole setup looks a bit convoluted/fragile).

This can be called multiple times if there are multiple calls to runtime.sendMessage() / runtime.connect() before the background page starts.  As discussed on IRC, I don't have any ideas for simplifying...

> ugh, why didn't the original delayed stuff need this for testing?

Because the test added here is the first one that does multiple restarts in a single test run.

> I think it's better to do this at the start of the test that depends on it, and not expect the world to be in a convenient state before the test.

I'm torn on this, if we rely on individual test cases cleaning up, then the failure when one case fails to clean up shows up in a different place and its inconvenient to trace it back to the source.  But on the other hand, if each test case cleans up just in case a previous case didn't clean up, that also masks the cases where a case isn't properly cleaning up.
Given that this whole test file is pretty self-contained, I'm inclined to keep it this way...
Pushed by aswan@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5cd398cc6a4b
Make browser.runtime.onMessage and onConnect persistent r=zombie
Backout by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b1f676207e7b
Backed out changeset 5cd398cc6a4b for failing browser chrome at  browser/components/extensions/test/browser/browser_ext_legacy_extension_context_contentscript.js on a CLOSED TREE
This also failed devtools at devtools/client/debugger/test/mochitest/browser_dbg_sources-webext-contentscript.js and xpcshell at xpcshell.ini:toolkit/components/extensions/test/xpcshell/test_ext_legacy_extension_context.js

https://treeherder.mozilla.org/logviewer.html#?job_id=181544295&repo=autoland&lineNumber=2397

and

https://treeherder.mozilla.org/logviewer.html#?job_id=181545788&repo=autoland&lineNumber=1043
Comment on attachment 8981739 [details]
Bug 1369841 Make browser.runtime.onMessage and onConnect persistent

https://reviewboard.mozilla.org/r/247828/#review255250

::: toolkit/components/extensions/parent/ext-toolkit.js:82
(Diff revision 1)
>  
>  function makeStartupPromise(event) {
>    return ExtensionUtils.promiseObserved(event).then(() => {});
>  }
>  
>  // browserPaintedPromise and browserStartupPromise are promises that

Please remove the comment as well.

::: toolkit/components/extensions/test/xpcshell/test_ext_messaging_startup.js:80
(Diff revisions 1 - 3)
> +      }
> +      waiters.delete(msg);
> +      waiter(args);
> +    });
> +  }
> +  function awaitMessage(msg) {

Wouldn't just this work:

    function awaitMessage(msg) {
      return new Promise(resolve => {
        extension.onMessage(msg, resolve);
      });
    }

Other than timing out instead of quick exit on wrong ordering, but I don't think that's a concern.
Comment on attachment 8981739 [details]
Bug 1369841 Make browser.runtime.onMessage and onConnect persistent

https://reviewboard.mozilla.org/r/247828/#review255250

> Please remove the comment as well.

I think this is some sort of reviewboard interdiff glitch, the current patch has no changes to ext-toolkit.js

> Wouldn't just this work:
> 
>     function awaitMessage(msg) {
>       return new Promise(resolve => {
>         extension.onMessage(msg, resolve);
>       });
>     }
> 
> Other than timing out instead of quick exit on wrong ordering, but I don't think that's a concern.

Hm, yeah, the failure would be more confusing but the code would certainly be simpler...
Comment on attachment 8981739 [details]
Bug 1369841 Make browser.runtime.onMessage and onConnect persistent

https://reviewboard.mozilla.org/r/247828/#review255250

> Hm, yeah, the failure would be more confusing but the code would certainly be simpler...

oh no, this doesn't work, you can't ever call awaitMessage() on the same message twice (since you can't remove the first handler, the second call fails with "only one message handler allowed")
Try run with the latest patch passing xpcshell:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bc00b55eba2dfa7ae7a0f540b570539a623c8c40
Flags: needinfo?(aswan)
Pushed by aswan@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f6c89ca4944b
Make browser.runtime.onMessage and onConnect persistent r=zombie
https://hg.mozilla.org/mozilla-central/rev/f6c89ca4944b
Status: REOPENED → RESOLVED
Closed: 3 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Depends on: 1467136
Upon installing https://adblockplus.org/ I still get the "Error: Could not establish connection. Receiving end does not exist. include.postload.js:766:5" error message in the console. This can be easily reproduced with disable/enable on the installed extension.(60% reproducible)
Tested on Firefox 62.0a1 (20180606093533) in Windows 10 64Bit.
Is this scenario valid? Any other scenarios that I should be testing in order to validate this?
No longer depends on: 1467136
Flags: needinfo?(aswan)
Depends on: 1467136
(In reply to marius.santa from comment #27)
> Upon installing https://adblockplus.org/ I still get the "Error: Could not
> establish connection. Receiving end does not exist.
> include.postload.js:766:5" error message in the console. This can be easily
> reproduced with disable/enable on the installed extension.(60% reproducible)
> Tested on Firefox 62.0a1 (20180606093533) in Windows 10 64Bit.
> Is this scenario valid? Any other scenarios that I should be testing in
> order to validate this?

I installed adblock plus in a test profile and I don't see this message.
You say you "still get" this message.  Does that mean you were seeing it before this patch landed?
Flags: needinfo?(aswan)
This caused big perf regressions according to our *_webext tests:

== Change summary for alert #13665 (as of Tue, 05 Jun 2018 21:54:52 GMT) ==

Regressions:

450%  tp5o_webext responsiveness linux64 opt e10s stylo     1.57 -> 8.64
410%  tp5o_webext responsiveness linux64 pgo e10s stylo     1.37 -> 6.98
373%  tp5o_webext responsiveness linux64-qr opt e10s stylo  2.11 -> 9.97
321%  tp5o_webext responsiveness windows7-32 pgo e10s stylo 0.88 -> 3.70
294%  tp5o_webext responsiveness windows10-64 pgo e10s stylo0.92 -> 3.61
293%  tp5o_webext responsiveness windows10-64-qr opt e10s stylo1.06 -> 4.18
289%  tp5o_webext responsiveness windows7-32 opt e10s stylo 1.09 -> 4.22
  3%  tp5o_webext linux64 opt e10s stylo                    226.00 -> 233.34
  3%  tp5o_webext linux64-qr opt e10s stylo                 236.87 -> 244.49

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=13665
Bug 1467136 seems to have resolved all of them. In case I spot something fishy, I'll let you know.
This gif was made using Firefox 62.0a1 (20180607100059) and addblocker plus from amo, latest version, the error does not always appear, you need to disable/enable it a few times before you can see it.
Flags: needinfo?(aswan)
You didn't answer my question, did something change recently or was this happening even before this bug landed?
In any case, is anything actually broken or is this just about a console message?
Flags: needinfo?(aswan)
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #30)
> Bug 1467136 seems to have resolved all of them. In case I spot something
> fishy, I'll let you know.

And here is the confirmation:

== Change summary for alert #13680 (as of Wed, 06 Jun 2018 19:19:13 GMT) ==

Improvements:

 81%  tp5o_webext responsiveness linux64 opt e10s stylo     8.61 -> 1.61
 80%  tp5o_webext responsiveness linux64 pgo e10s stylo     6.96 -> 1.38
 79%  tp5o_webext responsiveness linux64-qr opt e10s stylo  10.00 -> 2.09
 77%  tp5o_webext responsiveness windows7-32 pgo e10s stylo 3.78 -> 0.87
 75%  tp5o_webext responsiveness windows10-64-qr opt e10s stylo4.19 -> 1.05
 75%  tp5o_webext responsiveness windows10-64 pgo e10s stylo3.61 -> 0.92
 74%  tp5o_webext responsiveness windows7-32 opt e10s stylo 4.26 -> 1.09
 74%  tp5o_webext responsiveness windows10-64 opt e10s stylo4.03 -> 1.05
  3%  tp5o_webext linux64 pgo e10s stylo                    215.07 -> 208.15
  2%  tp5o_webext linux64-qr opt e10s stylo                 244.39 -> 239.00

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=13680
Considering https://bugzilla.mozilla.org/show_bug.cgi?id=1369841#c2 and other related bugs to this issue, was using adblocker as a way of testing what was fixed in the bug actually landed. It's just the console message, but that might be a symptom of something else.
I am able to reproduce this issue before the bug landed, reproduced in Firefox 59.0a1 (20171212220056).
If this is unrelated to the fix please provide steps or an extension to reproduce it.
Flags: needinfo?(aswan)
(In reply to marius.santa from comment #34)
> Considering https://bugzilla.mozilla.org/show_bug.cgi?id=1369841#c2 and
> other related bugs to this issue, was using adblocker as a way of testing
> what was fixed in the bug actually landed. It's just the console message,
> but that might be a symptom of something else.

Okay, but if the error was occurring before this bug or any related bugs landed then it is unrelated.

> If this is unrelated to the fix please provide steps or an extension to
> reproduce it.

The low-level behavior handled by this bug is covered by automated tests, it doesn't need manual testing.

I think the area where we could use manual testing here is, much like bug 1447551, testing for any regressions in extensions' behavior during browser startup on Nightly.
Flags: needinfo?(aswan)
Flags: qe-verify-
Depends on: 1468040
Product: Toolkit → WebExtensions
This is marked as “RESOLVED FIXED in Firefox 62” but I can reproduce it in Firefox Nightly 63.0a1 (2018-09-01) (64-bit).

Basically, the example here does not work: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/runtime/connect#Examples

Here’s a minimal extension demonstrating the problem: https://github.com/lydell/webext-message-issue

The issue is present for both browser.runtime.onMessage and browser.runtime.onConnect.

For me, content scripts for already existing tabs are run before background scripts 100% of the time and messaging fails until the background scripts are loaded.

Am I missing something? Should I open a new issue?
Hi Simon. Creating a new issue sounds good, and linking this one to it. 
Hope this helps.
Flags: needinfo?(simon.lydell)
Thanks! I found bug 1474727 which is open and seems to exactly what I’m looking for. I added a comment over there: bug 1474727 comment 3.
Flags: needinfo?(simon.lydell)
You need to log in before you can comment on or make changes to this bug.