Closed Bug 1369436 Opened 7 years ago Closed 5 years ago

PushComponents.js is loaded too early during startup

Categories

(Core :: DOM: Push Subscriptions, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox71 --- fixed

People

(Reporter: florian, Assigned: emmamalysz)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxperf:p2])

Attachments

(3 files)

PushComponents.js is loaded during app-startup, but all it really does during app-startup is add an observer for sessionstore-windows-restored http://searchfox.org/mozilla-central/rev/1a0d9545b9805f50a70de703a3c04fc0d22e3839/dom/push/PushComponents.js#79

I think we could instead trigger this initialization from nsBrowserGlue.js' _onWindowsRestored method: http://searchfox.org/mozilla-central/rev/1a0d9545b9805f50a70de703a3c04fc0d22e3839/browser/components/nsBrowserGlue.js#1062

This would let us not load any of this code before first paint.
Maybe Kit has an opinion here.
Flags: needinfo?(kit)
Priority: -- → P2
This sounds good to me; we can initialize even later, if there's a better place to do that work than `_onWindowsRestored`. The only reason we load it at startup is so that we can open a WebSocket connection and listen for incoming push messages...but the server will buffer messages while we're not connected, so we won't miss anything by waiting.
Flags: needinfo?(kit)
(In reply to Kit Cambridge (he/him) [:kitcambridge] (UTC-7) from comment #2)
> This sounds good to me; we can initialize even later, if there's a better
> place to do that work than `_onWindowsRestored`.

Then let's start it from a requestIdleCallback (or Services.tm.mainThread.idleDispatch) callback, requested from _onWindowsRestored.

> The only reason we load it
> at startup is so that we can open a WebSocket connection and listen for
> incoming push messages...

Do we actually need the websocket if the user hasn't granted push permissions to any specific website? If not, is this something we could figure out without loading the component?

Would be nice if we could make the whole thing load lazily when receiving one of the relevant messages from a content process (similarly to what was done in bug 1358921).
(In reply to Florian Quèze [:florian] [:flo] from comment #3)
> Do we actually need the websocket if the user hasn't granted push
> permissions to any specific website? If not, is this something we could
> figure out without loading the component?

We can enumerate the `desktop-notification` permissions, using something like https://hg.mozilla.org/mozilla-central/rev/ac87d7f863e7#l4.33, and skip loading the component if not. FxA uses push, too (http://searchfox.org/mozilla-central/source/services/fxaccounts/FxAccountsPush.js; in the parent process, so I'm not sure if the approach from bug 1358921 will work here), so we'd also want to load the component if the user is signed in.

What do you think?
Flags: needinfo?(florian)
(In reply to Kit Cambridge (he/him) [:kitcambridge] (UTC-7) from comment #4)
> (In reply to Florian Quèze [:florian] [:flo] from comment #3)
> > Do we actually need the websocket if the user hasn't granted push
> > permissions to any specific website? If not, is this something we could
> > figure out without loading the component?
> 
> We can enumerate the `desktop-notification` permissions, using something
> like https://hg.mozilla.org/mozilla-central/rev/ac87d7f863e7#l4.33, and skip
> loading the component if not.

Looks reasonable enough (it would be better if nsIPermissionManager could expose a way to get an enumerator for a specific permission type... but that could be a follow-up if it turns out to be expensive for some users with many permissions set).

> FxA uses push, too
> (http://searchfox.org/mozilla-central/source/services/fxaccounts/
> FxAccountsPush.js; in the parent process, so I'm not sure if the approach
> from bug 1358921 will work here), so we'd also want to load the component if
> the user is signed in.

Can this be done with a simple Services.prefs.prefHasUserValue (or similar) call without loading Fx account modules?

Slightly off-topic for this bug, but there are currently 4 fx account modules loaded after first paint but before we handle user events: FxAccountsCommon.js FxAccountsWebChannel.jsm FxAccounts.jsm FxAccountsStorage.jsm 
Could these also be loaded lazily?
Flags: needinfo?(florian)
(In reply to Florian Quèze [:florian] [:flo] from comment #5)
> Can this be done with a simple Services.prefs.prefHasUserValue (or similar)
> call without loading Fx account modules?

We can *probably* infer this from the `identity.fxaccounts.lastSignedInUserHash` pref, or `services.sync.username` if they're signed in to Sync. (I'm not sure if it's still possible to be signed in to FxA without Sync, but I'll find out).

> Could these also be loaded lazily?

Filed bug 1369539.
Oh, hmm...actually, the push service also listens for permission changes and connection enabled changes, and expires stale push subscriptions on idle. An origin might not have the `desktop-notification` permission, but still have stale subscriptions that we want to evict. Determining this requires importing `PushDB.jsm`, so we may as well load the component and have it decide what to do. `Services.tm.mainThread.idleDispatch` from `_onWindowsRestored` seems like the best way to go.
Assignee: nobody → kit
Status: NEW → ASSIGNED
The code running when receiving the sessionstore-windows-restored notification is also causing jank.
See this profile: https://perfht.ml/2ro0Au2 (captured on the quantum reference hardware)

In addition to calling this using idle dispatch, could we reduce the amount of work done there, or split it in chunks?

The linked profile shows us first importing resource://gre/modules/PushService.jsm, and then:
Components.utils::import resource://gre/modules/PushServiceWebSocket.jsm
Components.utils::import resource://gre/modules/PushRecord.jsm
Components.utils::import resource://gre/modules/PushServiceHttp2.jsm
Components.utils::import resource://gre/modules/PushCrypto.jsm
Components.utils::import resource://gre/modules/PushDB.jsm
Components.utils::import resource://gre/modules/IndexedDBHelper.jsm

Do we need all of these modules immediately? If not, some of this should become lazy module getters. (From looking at the PushService.jsm code a little bit, it seems PushServiceHttp2.jsm will never be used if PushServiceWebSocket.jsm is usable.)
(In reply to Florian Quèze [:florian] [:flo] from comment #5)
> (In reply to Kit Cambridge (he/him) [:kitcambridge] (UTC-7) from comment #4)
> > FxA uses push, too
> > (http://searchfox.org/mozilla-central/source/services/fxaccounts/
> > FxAccountsPush.js; in the parent process, so I'm not sure if the approach
> > from bug 1358921 will work here), so we'd also want to load the component if
> > the user is signed in.
> 
> Can this be done with a simple Services.prefs.prefHasUserValue (or similar)
> call without loading Fx account modules?

That module goes out of its way to load only once a relevant notification is received via http://searchfox.org/mozilla-central/source/services/fxaccounts/FxAccountsComponents.manifest#4, so I'd expect that to co-operate in a lazier scheme.

> Slightly off-topic for this bug, but there are currently 4 fx account
> modules loaded after first paint but before we handle user events:
> FxAccountsCommon.js FxAccountsWebChannel.jsm FxAccounts.jsm
> FxAccountsStorage.jsm 
> Could these also be loaded lazily?

Florian pointed out bug 1369539 which is probably different - we should certainly be able to arrange for push to not be the earliest import of those modules.
Comment on attachment 8880192 [details]
Bug 1369436 - Alphabetize and lazily import push modules.

https://reviewboard.mozilla.org/r/151560/#review157264

::: dom/push/PushRecord.jsm:133
(Diff revision 1)
>     *
>     * @returns {Promise} A promise resolved with either the last time the user
>     *  visited the site, or `-Infinity` if the site is not in the user's history.
>     *  The time is expressed in milliseconds since Epoch.
>     */
> -  getLastVisit: Task.async(function* () {
> +  async getLastVisit() {

I bitrotted you with bug 1374282.

::: dom/push/PushService.jsm:28
(Diff revision 1)
> -const {PushDB} = Cu.import("resource://gre/modules/PushDB.jsm");
> +                                  "resource://gre/modules/PushCrypto.jsm");
>  
> -const CONNECTION_PROTOCOLS = (function() {
> +XPCOMUtils.defineLazyGetter(this, "CONNECTION_PROTOCOLS", () => {
>    if ('android' != AppConstants.MOZ_WIDGET_TOOLKIT) {
>      const {PushServiceWebSocket} = Cu.import("resource://gre/modules/PushServiceWebSocket.jsm");
>      const {PushServiceHttp2} = Cu.import("resource://gre/modules/PushServiceHttp2.jsm");

This is still importing these 2 modules at once when it's very likely that only one of the two will be used.
Attachment #8880192 - Flags: review?(florian)
Comment on attachment 8880193 [details]
Bug 1369436 - Listen for push messages once all windows have been restored.

https://reviewboard.mozilla.org/r/151562/#review157268

This patch looks good, but you'll have test failures if you don't remove this whitelist entry: http://searchfox.org/mozilla-central/rev/3291398f10dcbe192fb52e74974b172616c018aa/browser/base/content/test/performance/browser_startup.js#33

Should we blacklist this component in the "before handling user events" phase?
Attachment #8880193 - Flags: review?(florian) → review+
Comment on attachment 8880192 [details]
Bug 1369436 - Alphabetize and lazily import push modules.

https://reviewboard.mozilla.org/r/151560/#review159160

Thanks!
Attachment #8880192 - Flags: review?(florian) → review+
Android tests are failing in https://treeherder.mozilla.org/#/jobs?repo=try&revision=ce1886276b35 because we now call `PushService.ensureReady()` from `nsBrowserGlue` instead of automatically initializing the service on `app-startup`.

I think I need to call `PushService.ensureReady` from http://searchfox.org/mozilla-central/rev/a3a739de04ee6134c11546568a33dbb6a6a29907/mobile/android/chrome/content/browser.js#381 to make this work, but I'm not sure.

Alternatively, we have an `android-push-service` category that we use to initialize the push service when we receive a message. Instead of adding `PushService.ensureReady` to browser.js, should I instead do something like http://searchfox.org/mozilla-central/rev/a3a739de04ee6134c11546568a33dbb6a6a29907/mobile/android/base/java/org/mozilla/gecko/push/PushService.java#255-257 in `GeckoApp#onStartup`? Is it possible that might race if we receive a push message while Gecko is starting up?

Jim, can you suggest the best approach to use here? Thanks!
Flags: needinfo?(nchen)
Do you need to call `PushService.ensureReady` when we start Gecko in the background due to a background push message? In that case `browser.js` is _not_ loaded because the UI is not shown, so you may have to call `ensureReady` in the `android-push-service` handler if needed.
Flags: needinfo?(nchen)
(In reply to Jim Chen [:jchen] [:darchons] from comment #21)
> Do you need to call `PushService.ensureReady` when we start Gecko in the
> background due to a background push message?

I do, and I think the existing `android-push-service` handler takes care of that.

I'm wondering about the foreground case, though: should I add an `ensureReady` call to `browser.js`, or change `GeckoApp#onAppStartup` to trigger `android-push-service` via `getIntentToCreateServices`? Is one better than the other?
Flags: needinfo?(nchen)
Both sound good to me. Maybe the browser.js way is simpler?

And as a third option, what about calling `GeckoThread.createServices("android-push-service", null);` from `PushService.onStartup`? That keeps everything inside `PushService`. We'd end up getting `android-push-service` twice when starting in the background, but I think that should be harmless.

Another consideration is, for GeckoView (i.e. running Gecko in third-party apps), `browser.js/GeckoApp` are not used at all. We'd have to initialize `PushService` some other way for GeckoView, but we don't have to worry about that for now.
Flags: needinfo?(nchen)
Hmm. I tried both the browser.js way, and your suggestion to call `GeckoThread.createServices("android-push-service", null)` from Java...and the push service still doesn't start: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2e9ba2bc6da0d64625f2171c81c1355d9de42288

I think the dom/cache, dom/fetch, and dom/serviceworkers tests are timing out because we unsubscribe from push when unregistering a service worker. The push service hasn't initialized, so the test hangs waiting for `nsIPushService.unsubscribe` to return.

Is there something different about how the mochitests are run that causes them to bypass browser.js/GeckoApp?
Flags: needinfo?(nchen)
Mochitests are run in a normal Fennec session, so browser.js/GeckoApp should behave normally. I see a lot of these lines in all of the logs. Could they be related?

> [JavaScript Error: "Attempt to stop observing a preference "connection.enabled" that's not being observed" {file: "resource://gre/modules/Preferences.jsm" line: 318}]
>   Preferences.ignore@resource://gre/modules/Preferences.jsm:318:5
>   _stopObservers@resource://gre/modules/PushService.jsm:604:5
>   _stopService@resource://gre/modules/PushService.jsm:568:5
>   _changeServerURL@resource://gre/modules/PushService.jsm:413:16
>   _shutdownService@resource://gre/modules/PushService.jsm:613:28
>   uninit/<@resource://gre/modules/PushService.jsm:629:42
> 
> console.error: PushService:
> 
> [JavaScript Error: "stateChangeProcessEnqueue: Error transitioning state [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://gre/modules/PushService.jsm :: _stopObservers :: line 606"  data: no]"]
>   stateChangeProcessEnqueue: Error transitioning state
>   Message: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://gre/modules/PushService.js :: _stopObservers :: line 606"  data: no]
>   Stack:
>     _stopObservers@resource://gre/modules/PushService.jsm:606:5
>     _stopService@resource://gre/modules/PushService.jsm:568:5
>     _changeServerURL@resource://gre/modules/PushService.jsm:413:16
>     _shutdownService@resource://gre/modules/PushService.jsm:613:28
>     uninit/<@resource://gre/modules/PushService.jsm:629:42
> 
> [JavaScript Error: "Attempt to stop observing a preference "connection.enabled" that's not being observed" {file: "resource://gre/modules/Preferences.jsm" line: 318}]
>   Preferences.ignore@resource://gre/modules/Preferences.jsm:318:5
>   _stopObservers@resource://gre/modules/PushService.jsm:604:5
>   _stopService@resource://gre/modules/PushService.jsm:568:5
>   _changeServerURL@resource://gre/modules/PushService.jsm:413:16
>   _shutdownService@resource://gre/modules/PushService.jsm:613:28
>   _stateChangeProcessEnqueue/this._stateChangeProcessQueue<@resource://gre/modules/PushService.jsm:163:16
> 
> console.error: PushService:
> 
> [JavaScript Error: "stateChangeProcessEnqueue: Error shutting down service [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://gre/modules/PushService.jsm :: _stopObservers :: line 606"  data: no]"]
>   stateChangeProcessEnqueue: Error shutting down service
>   Message: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://gre/modules/PushService.jsm :: _stopObservers :: line 606"  data: no]
>   Stack:
>     _stopObservers@resource://gre/modules/PushService.jsm:606:5
>     _stopService@resource://gre/modules/PushService.jsm:568:5
>     _changeServerURL@resource://gre/modules/PushService.jsm:413:16
>     _shutdownService@resource://gre/modules/PushService.jsm:613:28
>     _stateChangeProcessEnqueue/this._stateChangeProcessQueue<@resource://gre/modules/PushService.jsm:163:16

Otherwise, I think you may have to put some logs in to see if `PushService.ensureReady` is actually called. Also, where is `PushService.ensureReady` defined? I couldn't find it in `PushComponents.js`.
Flags: needinfo?(nchen)
I missed those, thanks for pointing them out! IIUC, they show up in the log after the test times out. The "attempt to stop observing a preference" logs are warnings, and I suspect the `nsIObserverService.removeObserver` error is because we didn't finish initializing and attaching our observers...but we definitely know the module is loaded, or `uninit` wouldn't be called. Time to add some more logging.

`ensureReady` is in part 2 of this patch; I renamed it from `_handleReady`.
Whiteboard: [fxperf]
Hey Kit, is this blocked on something that's not in the bug comments? Are you still planning to work on it?
Flags: needinfo?(kit)
Whiteboard: [fxperf] → [fxperf:p2]
Hi Doug, I'm sorry this slipped off my radar! I added https://hg.mozilla.org/try/rev/aab6beab912f0fb7c43a4c41ac24f45c06c1a0cf, but I never did figure out why the service wouldn't initialize on Android, and then got sidetracked by other work. :-(

I'm not likely to get to this in the next few weeks, so please feel free to snag it if you want.
Flags: needinfo?(kit)
Just doing triage - I'll unassign you for now so it's available for anyone to pick up, and I might snag it in the near future.
Assignee: kit → nobody
Status: ASSIGNED → NEW
See Also: → 1568231
Assignee: nobody → emalysz
Attachment #9091152 - Attachment description: Bug 1369436, modernizing old patch to load PushComponents.js after startup → Bug 1369436, Load PushComponents.js after startup.

Thanks, Emma - looks like your patch has been approved! Can I presume this had a green try run with the mochitest-browser suite on our desktop platforms?

Flags: needinfo?(emalysz)
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a956d11fef6e
Load PushComponents.js after startup. r=lina
Regressions: 1522950
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6a48651180af
Load PushComponents.js after startup. r=lina,mconley

Hi Emma.
The regression mentioned above is occurring again since this bug got landed again. I filed a new bug to not reopen the fixed one. Can you take a look?

https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedJob=268388766&resultStatus=testfailed%2Cbusted%2Cexception&tochange=d6079cfa0c4ffe967dc7352b188ab3b24c487936&fromchange=bf2586dc82562136449ebe14bbc14b609c20c245

Regressions: 1583950
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71
Flags: needinfo?(emalysz)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: