Closed Bug 1260801 Opened 8 years ago Closed 8 years ago

Sync hangs fetching FxA device ID

Categories

(Firefox :: Firefox Accounts, defect, P1)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: RyanVM, Assigned: lina)

References

Details

Attachments

(1 file)

Attached file Sync log
For about the last week, Sync has been perma-spinning for me. I'm on a win64 trunk build. At rnewman's request, I'm attaching the interesting parts of a recent Sync log. Let me know if you want me to try any experimentation in general.
This is almost certainly bug 1258596 (or the closely related bug 1258595)

The interesting part here is (reformatted):

> 1459312388478	Sync.Store.Clients	WARN	failed to get fxa device id: [Exception... "App. Quitting"  nsresult: "0x80004004 (NS_ERROR_ABORT)"  location: "JS frame :: resource://services-common/async.js :: checkAppReady/onQuitApplication/Async.checkAppReady :: line 122"  data: no] Stack trace: checkAppReady/onQuitApplication/Async.checkAppReady()@resource://services-common/async.js:122
>  < waitForSyncCallback()@resource://services-common/async.js:97
>  < makeSpinningCallback/callback.wait()@resource://services-common/async.js:152
>  < createRecord()@resource://gre/modules/services-sync/engines/clients.js:441
>  < SyncEngine.prototype._createRecord()@resource://services-sync/engines.js:860
>  < SyncEngine.prototype._uploadOutgoing()@resource://services-sync/engines.js:1485
>  < SyncEngine.prototype._sync()@resource://services-sync/engines.js:1553
>  < WrappedNotify()@resource://services-sync/util.js:146
>  < Engine.prototype.sync()@resource://services-sync/engines.js:670
>  < _syncEngine()@resource://services-sync/stages/enginesync.js:213
>  < sync()@resource://services-sync/stages/enginesync.js:91
>  < onNotify()@resource://gre/modules/services-sync/service.js:1262
>  < WrappedNotify()@resource://services-sync/util.js:146
>  < WrappedLock()@resource://services-sync/util.js:101
>  < _lockedSync()@resource://gre/modules/services-sync/service.js:1252
>  < sync/<()@resource://gre/modules/services-sync/service.js:1244
>  < WrappedCatch()@resource://services-sync/util.js:75
>  < sync()@resource://gre/modules/services-sync/service.js:1232
> 

which means Sync has actually started, but we are hanging attempting to get the FxA device ID. The function in question is at https://dxr.mozilla.org/mozilla-central/rev/494289c72ba3997183e7b5beaca3e0447ecaf96d/services/fxaccounts/FxAccounts.jsm#591, which ends up doing the push registration - which means it looks alot like bug 1258596. When I hit that bug, I was actually doing a sign-in, which caused Sync to hang before starting. However, I'm confident that users striking this bug who are already logged in will see the behaviour above (ie, sync starts but hangs early when syncing the clients engine.)

I'm not going to mark this as a dupe yet as that other bug is in a different component - once that bug is fixed we should verify the issue goes away and maybe dupe at that time.
Component: Firefox Sync: Backend → FxAccounts
Depends on: 1258596
Flags: firefox-backlog+
Priority: -- → P1
Product: Cloud Services → Core
Summary: Sync appears to be perma-stuck on my desktop client → Sync hangs fetching FxA device ID
Ryan, a few questions for you, at your convenience:

1. Do you see anything mentioning "ActorsParent.cpp" in the browser console?
2. Are IDB and Push broken for web sites?
3. Are you using the profile with multiple channels? That seems to be the easiest way to trigger IDB corruption, but I'm curious if there are other ways to cause this.

Bug 1258595 won't fix the device registration, but it'll at least unblock Sync from starting.
Flags: needinfo?(ryanvm)
(In reply to Kit Cambridge [:kitcambridge] from comment #2)
> 1. Do you see anything mentioning "ActorsParent.cpp" in the browser console?
No.
> 2. Are IDB and Push broken for web sites?
https://simple-push-demo.appspot.com/ doesn't work, so I guess so.
> 3. Are you using the profile with multiple channels? That seems to be the
> easiest way to trigger IDB corruption, but I'm curious if there are other
> ways to cause this.
Nope. My regular profile is nightly-only.
Flags: needinfo?(ryanvm)
I tried applying the patch from bug 1258595 to my local build, but I'm still perma-spinning.
Deleting the storage directory of my profile doesn't appear to have done anything either. Anything else you want me to try?
I guess one small step we could make is to apply the following patch - it just has 2 dump statements (so you'll need to start with -console and ensure browser.dom.window.dump.enabled is true - or just replace them with, say, Cu.reportError() and check the browser console).

I suspect you will see the first and not the second. That will confirm the push service is the problem. Sadly it will not help us work out *why*, but it's still valuable to know for sure that the problem is related to push.

If you really want to get Sync running again in the meantime, you should be able to create a boolean pref "identity.fxaccounts.skipDeviceRegistration" set to true - but please remember to unset that pref later when we ask for more help with diagnostics :)

diff --git a/services/fxaccounts/FxAccountsPush.js b/services/fxaccounts/FxAccountsPush.js
index df98455..5b0494c 100644
--- a/services/fxaccounts/FxAccountsPush.js
+++ b/services/fxaccounts/FxAccountsPush.js
@@ -92,11 +92,13 @@ FxAccountsPushService.prototype = {
    */
   registerPushEndpoint() {
     this.log.trace("FxAccountsPush registerPushEndpoint");
+    dump("Registering endpoint\n");

     return new Promise((resolve) => {
       this.pushService.subscribe(FXA_PUSH_SCOPE_ACCOUNT_UPDATE,
         Services.scriptSecurityManager.getSystemPrincipal(),
         (result, subscription) => {
+          dump("Register callback\n");
           if (Components.isSuccessCode(result)) {
             this.log.debug("FxAccountsPush got subscription");
             resolve(subscription);
Also, could you please set "dom.push.loglevel" to "all"?
Ryan, do you have some cycles to help us debug this?
Flags: needinfo?(ryanvm)
I think this is what you want. This is what I get when I click the Sync Now button. As predicted, "Register callback" never shows up.

> PushService:register() Object { scope: "chrome://fxa-device-update", systemRecord: true, originAttributes: "" }   PushService.jsm:1086

> Registering endpoint     FxAccountsPush.js:9

I also got an "Invalid chrome URI: /" message in there, but I'm not sure where it came from or whether it's actually related or not. But I figured I'd mention it just in case.
Flags: needinfo?(ryanvm)
Also possible-related, this message showed up a little bit later:
> TypeError: this._recipeManager is null           LoginManagerParent.jsm:77:9
(In reply to Ryan VanderMeulen [:RyanVM] from comment #9)
> I think this is what you want. This is what I get when I click the Sync Now
> button. As predicted, "Register callback" never shows up.
> 
> > PushService:register() Object { scope: "chrome://fxa-device-update", systemRecord: true, originAttributes: "" }   PushService.jsm:1086

Thanks very much, Ryan! Definitely a Push issue. If you could, I'd like to see all the PushService and PushServiceWebSocket logs from the console. I take it nothing shows up after that "register" log message?

In particular, I'm interested in anything mentioning "PushService: setState()", so we can see where it's hanging when it starts up.
Flags: needinfo?(ryanvm)
Assignee: nobody → kcambridge
Sync appears to have un-busted itself. \m/? :)
Flags: needinfo?(ryanvm)
Depends on: 1264710
Depends on: 1258595
No longer depends on: 1258596
Sync is no longer hanging for me in my profile with the borked indexed db \o/

Console reports log is below, but now that Sync is no longer hanging and we have bug 1258596 on file, I think we can call this WFM.

Thanks Kit!

console.debug: PushService:
  dropExpiredRegistrations()
console.debug: PushDB:
  getAllExpired()
console.debug: PushDB:
  getAllByPushQuota()
console.error: PushService:
  stateChangeProcessEnqueue: Error transitioning state
  UnknownError
console.debug: PushService:
  changeServerURL()
console.debug: PushService:
  stopService()
console.debug: PushService:
  stopObservers()
console.debug: PushServiceWebSocket:
  shutdownWS()
console.debug: PushServiceWebSocket:
  shutdownWS()
console.debug: PushService:
  setState()
  new state
  0
  old state
  2
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
Product: Core → Firefox
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: