Closed Bug 1258595 Opened 8 years ago Closed 8 years ago

Shut down the Push service if errors occur at startup

Categories

(Core :: DOM: Notifications, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: lina, Assigned: lina)

References

Details

(Whiteboard: btpp-active)

Attachments

(2 files)

It's currently possible for us to deadlock on errors during initialization. If this happens, we should shut down the Push service and reject pending requests with an error, instead of letting them hang indefinitely.
See Also: → 1258596
Comment on attachment 8733172 [details]
MozReview Request: Bug 1258595 - Shut down the Push service if errors occur at startup. r=wchen

https://reviewboard.mozilla.org/r/41611/#review38363

::: dom/push/PushService.jsm:593
(Diff revision 1)
>  
>      this._updateQuotaTimeouts.forEach((timeoutID) => clearTimeout(timeoutID));
>      this._updateQuotaTimeouts.clear();
>  
> +    if (this._notifyActivated) {
> +      this._notifyActivated.reject(new Error("Push service shutting down"));

Are you running into a scenario where this code is necessary? It looks like changing state to PUSH_SERVICE_UNINIT should reject \_notifyAcivated. If we are shutting down without changing state then it would make me suspect that something has gone wrong.
Attachment #8733172 - Flags: review?(wchen)
(In reply to William Chen [:wchen] from comment #2)
> Are you running into a scenario where this code is necessary? It looks like
> changing state to PUSH_SERVICE_UNINIT should reject \_notifyAcivated. If we
> are shutting down without changing state then it would make me suspect that
> something has gone wrong.

Sorry, I've been meaning to circle back to this. The test in this patch runs into that: we transition from PUSH_SERVICE_UNINIT to PUSH_SERVICE_ACTIVATING, and then back to PUSH_SERVICE_UNINIT when IDB returns an error.

I think the fix is to swap `this._setState(PUSH_SERVICE_UNINIT)` and `this._changeServerURL("", UNINIT_EVENT)` in `PushService.uninit`, because `_changeServerURL` calls `_stopService` synchronously, and we still have `_notifyActivated` at that point.

markh also suggested making this sticky, so that we don't try to restart the service on `register`, etc. if it's permanently broken. I'll address that in a new patch.
Comment on attachment 8733172 [details]
MozReview Request: Bug 1258595 - Shut down the Push service if errors occur at startup. r=wchen

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/41611/diff/1-2/
Attachment #8733172 - Flags: review?(wchen)
Comment on attachment 8733172 [details]
MozReview Request: Bug 1258595 - Shut down the Push service if errors occur at startup. r=wchen

https://reviewboard.mozilla.org/r/41611/#review41453

::: dom/push/PushService.jsm:220
(Diff revisions 1 - 2)
>          // PushService was not in the offline state, but got notification to
>          // go online (a offline notification has not been sent).
>          // Disconnect first.
>          this._service.disconnect();
>        }
> -      this.getAllUnexpired()
> +      return this.getAllUnexpired()

We should make this method always returns a promise.

::: dom/push/PushService.jsm:240
(Diff revisions 1 - 2)
>          this._state != PUSH_SERVICE_ACTIVATING) {
>        return;
>      }
>  
>      if (enabled) {
> -      this._changeStateOfflineEvent(Services.io.offline, true);
> +      return this._changeStateOfflineEvent(Services.io.offline, true);

This method should also always return a promise.

::: dom/push/PushService.jsm:411
(Diff revisions 1 - 2)
>          let [service, uri] = this._findService(serverURI);
>          if (!service) {
>            this._setState(PUSH_SERVICE_INIT);
>            return Promise.resolve();
>          }
> -        return this._startService(service, uri)
> +        return this._startService(service, uri, options)

We can remove the default argument from \_startService now that all of the callers pass in options.

::: dom/push/PushService.jsm:636
(Diff revisions 1 - 2)
>      prefs.ignore("serverURL", this);
>      Services.obs.removeObserver(this, "quit-application");
>  
>      this._stateChangeProcessEnqueue(_ =>
>        {
> -        var p = this._changeServerURL("", UNINIT_EVENT);
> +        var p = this._shutdownService();

Move the console.debug messages into _shutdownService and just return this._shutdownService() here.
Attachment #8733172 - Flags: review?(wchen) → review+
backed out for test failures like  https://treeherder.mozilla.org/logviewer.html#?job_id=8501165&repo=fx-team
Flags: needinfo?(kcambridge)
Whiteboard: btpp-active
Review commit: https://reviewboard.mozilla.org/r/45793/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/45793/
Attachment #8733172 - Attachment description: MozReview Request: Bug 1258595 - Shut down the Push service if errors occur at startup. r?wchen → MozReview Request: Bug 1258595 - Shut down the Push service if errors occur at startup. r=wchen
Attachment #8740481 - Flags: review?(wchen)
Comment on attachment 8733172 [details]
MozReview Request: Bug 1258595 - Shut down the Push service if errors occur at startup. r=wchen

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/41611/diff/2-3/
OK, I think I get what's going on.

https://treeherder.mozilla.org/logviewer.html#?job_id=19148268&repo=try suggests we aren't waiting for the service to shut down from the previous test. There's a race where the child sends `socket-teardown`, the parent restarts the service in the background, and the child starts the next test. When the test calls `pushManager.subscribe()`, the call is rejected (https://dxr.mozilla.org/mozilla-central/rev/21bf1af375c1fa8565ae3bb2e89bd1a0809363d4/dom/push/PushService.jsm#996-1002).

The "onRegisterError: Called without valid error message! Error {}" logging isn't helpful (bug 1263747), but I suspect that's the error we're running into, given that I don't see any "PushServiceWebSocket" log messages after `PushService.register`.

We may want to make transitioning between services more reliable. For now, this patch does the trick. https://treeherder.mozilla.org/#/jobs?repo=try&revision=6ff5628fb297 looks good.
Flags: needinfo?(kcambridge)
Attachment #8740481 - Flags: review?(wchen)
Comment on attachment 8740481 [details]
MozReview Request: Bug 1258595 - Wait for the Push service to shut down between tests. r?wchen

https://reviewboard.mozilla.org/r/45793/#review42645

::: dom/push/PushService.jsm:1004
(Diff revision 1)
>        return Promise.reject(new Error("Push service offline"));
>      }
> +    // Ensure the backend is ready. `getByPageRecord` already checks this, but
> +    // we need to check again here in case the service was restarted in the
> +    // meantime.
> +    return this._checkActivated().then(_ => {

Instead of doing this, can we clear _pendingRegisterRequest when the service gets restarted and add log a warning (saying that the server was restarted with pending register requests) when we do so?
Comment on attachment 8740481 [details]
MozReview Request: Bug 1258595 - Wait for the Push service to shut down between tests. r?wchen

https://reviewboard.mozilla.org/r/45793/#review42721
Attachment #8740481 - Flags: review+
https://reviewboard.mozilla.org/r/45793/#review42645

> Instead of doing this, can we clear _pendingRegisterRequest when the service gets restarted and add log a warning (saying that the server was restarted with pending register requests) when we do so?

Nevermind, pending registration requests don't have state dependant on the activation of the service so we don't need to clear them.
https://hg.mozilla.org/mozilla-central/rev/915a3ac4e563
https://hg.mozilla.org/mozilla-central/rev/1085ea32229e
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Blocks: 1260801
You need to log in before you can comment on or make changes to this bug.