Closed Bug 1052549 Opened 10 years ago Closed 8 years ago

Sync error bar appearing with "Unknown error" in Nightly

Categories

(Firefox :: Sync, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: ckarlof, Unassigned)

Details

(Whiteboard: [qa+])

Attachments

(3 files, 1 obsolete file)

Attached image Sync error bar
On Nightly 34.0a1 (2014-08-12) after startup: 1407861919954 Sync.Service INFO Loading Weave 1.36.0 1407861919956 Sync.Engine.Clients DEBUG Engine initialized 1407861919956 Sync.Engine.Clients DEBUG Resetting clients last sync time 1407861919960 Sync.Engine.Bookmarks DEBUG Engine initialized 1407861919962 Sync.Engine.Forms DEBUG Engine initialized 1407861919965 Sync.Engine.History DEBUG Engine initialized 1407861919967 Sync.Engine.Passwords DEBUG Engine initialized 1407861919969 Sync.Engine.Prefs DEBUG Engine initialized 1407861919971 Sync.Engine.Tabs DEBUG Engine initialized 1407861919972 Sync.Engine.Tabs DEBUG Resetting tabs last sync time 1407861919976 Sync.Engine.Addons DEBUG Engine initialized 1407861919977 Sync.Service INFO Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:34.0) Gecko/20100101 Firefox/34.0 1407861919979 Sync.SyncScheduler DEBUG Clearing sync triggers and the global score. 1407861919981 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861919981 Sync.Service DEBUG Caching URLs under storage user base: https://sync-29-us-west-2.sync.services.mozilla.com/1.5/3512922/ 1407861919981 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861919982 Sync.AddonsReconciler INFO Registering as Add-on Manager listener. 1407861919982 Sync.AddonsReconciler DEBUG Adding change listener. 1407861919982 Sync.Tracker.History INFO Adding Places observer. 1407861919986 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861919988 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861919988 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861919995 FirefoxAccounts INFO returning partial account data as the login manager is locked. 1407861925986 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861925986 Sync.Service DEBUG User-Agent: Firefox/34.0a1 FxSync/1.36.0.20140812030202. 1407861925986 Sync.Service INFO Starting sync at 2014-08-12 09:45:25 1407861925986 Sync.Service DEBUG In sync: should login. 1407861925987 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861925987 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861925988 FirefoxAccounts INFO returning partial account data as the login manager is locked. 1407861925988 Sync.BrowserIDManager INFO Waiting for user to be verified. 1407861925988 FirefoxAccounts DEBUG already verified 1407861925989 Sync.BrowserIDManager INFO Starting fetch for key bundle. 1407861925989 Sync.BrowserIDManager INFO _fetchTokenForUser has no keys to use. 1407861925989 Sync.BrowserIDManager INFO Background fetch for key bundle done 1407861925989 Sync.Status DEBUG Status.login: success.login => success.login 1407861925989 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407861925989 Sync.Service INFO Logging in the user. 1407861925989 Sync.Service DEBUG Caching URLs under storage user base: https://sync-29-us-west-2.sync.services.mozilla.com/1.5/3512922/ 1407861926115 Sync.SyncScheduler DEBUG Next sync in 3600000 ms. 1407861932288 Sync.Service DEBUG Fetching unlocked auth state returned success.status_ok 1407861932289 Sync.BrowserIDManager INFO Fetching assertion and token from: https://token.services.mozilla.com/1.0/sync/1.5 1407861932289 FirefoxAccounts DEBUG already verified 1407861932289 Sync.BrowserIDManager DEBUG Getting an assertion 1407861932289 FirefoxAccounts DEBUG enter getAssertion() 1407861932321 FirefoxAccounts DEBUG got keyPair 1407861932321 FirefoxAccounts DEBUG getCertificateSigned: true true 1407861932887 FirefoxAccounts DEBUG getCertificate got a new one: true 1407861932888 FirefoxAccounts DEBUG getAssertionFromCert 1407861932889 FirefoxAccounts DEBUG getAssertionFromCert returning signed: true 1407861932889 Sync.BrowserIDManager DEBUG Getting a token 1407861933063 Sync.BrowserIDManager DEBUG Successfully got a sync token 1407861933064 Sync.BulkKeyBundle INFO BulkKeyBundle being created for undefined 1407861933201 Sync.Resource DEBUG mesg: GET success 200 https://sync-29-us-west-2.sync.services.mozilla.com/1.5/3512922/info/collections 1407861933201 Sync.Resource DEBUG GET success 200 https://sync-29-us-west-2.sync.services.mozilla.com/1.5/3512922/info/collections 1407861933201 Sync.Service DEBUG Fetching global metadata record 1407861933202 Sync.BrowserIDManager DEBUG _ensureValidToken already has one 1407861933238 Sync.Resource DEBUG mesg: GET success 200 https://sync-29-us-west-2.sync.services.mozilla.com/1.5/3512922/storage/meta/global 1407861933238 Sync.Resource DEBUG GET success 200 https://sync-29-us-west-2.sync.services.mozilla.com/1.5/3512922/storage/meta/global 1407861933238 Sync.Service DEBUG Clearing cached meta record. metaModified is undefined, setting to 1406831703.82 1407861933239 Sync.BrowserIDManager DEBUG _ensureValidToken already has one 1407861933278 Sync.Resource DEBUG mesg: GET success 200 https://sync-29-us-west-2.sync.services.mozilla.com/1.5/3512922/storage/meta/global 1407861933278 Sync.Resource DEBUG GET success 200 https://sync-29-us-west-2.sync.services.mozilla.com/1.5/3512922/storage/meta/global 1407861933278 Sync.Service DEBUG Weave Version: 1.36.0 Local Storage: 5 Remote Storage: 5 1407861933279 Sync.Service INFO Sync key is up-to-date: no need to upgrade. 1407861933279 Sync.Service DEBUG Fetching and verifying -- or generating -- symmetric keys. 1407861933279 Sync.Service INFO Testing info/collections: {"tabs":1407804414.5,"clients":1407804414.38,"crypto":1404238256.83,"forms":1404238264.2,"meta":1406831703.82,"bookmarks":1404238262.33,"prefs":1406591151.48,"history":1406828099.27} 1407861933279 Sync.CollectionKeyManager INFO Testing for updateNeeded. Last modified: 0 1407861933279 Sync.Service INFO collection keys reports that a key update is needed. 1407861933280 Sync.BrowserIDManager DEBUG _ensureValidToken already has one 1407861935644 Sync.Tracker.History DEBUG Saving changed IDs to history 1407861939644 Sync.Tracker.History DEBUG Saving changed IDs to history 1407861958416 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1407862233132 Sync.Service WARN Got exception "Error: Aborting due to channel inactivity." fetching cryptoKeys. 1407862233132 Sync.Status DEBUG Status.login: success.login => error.login.failed 1407862233132 Sync.Status DEBUG Status.service: success.status_ok => error.login.failed 1407862233132 Sync.Service WARN Failed to fetch symmetric keys. Failing remote setup. 1407862233132 Sync.Service WARN Remote setup failed. 1407862233134 Sync.SyncScheduler DEBUG Clearing sync triggers and the global score. 1407862233135 Sync.SyncScheduler DEBUG Next sync in 3600000 ms.
> 1407861958416 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok > 1407862233132 Sync.Service WARN Got exception "Error: Aborting due to channel inactivity." fetching cryptoKeys. > 1407862233132 Sync.Status DEBUG Status.login: success.login => error.login.failed > 1407862233132 Sync.Status DEBUG Status.service: success.status_ok => error.login.failed > 1407862233132 Sync.Service WARN Failed to fetch symmetric keys. Failing remote setup. > 1407862233132 Sync.Service WARN Remote setup failed. This looks like the important bit. Investigating...
This was also one of the messaged found in Bug 1047700.
This could be a regression of Bug 624436. Gavin suggested that it may have been introduced by resolution of Bug 237623.
I might have triggered this sync manually as well, which I believe would show the error bar during a network timeout. :(
Whiteboard: [qa+]
The timestamps in the logs show pretty much exactly 5 minutes between going to check the keys and receiving the error, so this is almost certainly the explicit 5-minute timeout that sync enforces upon its requests, and is not coming from a lower layer of the network stack. I suspect that the issue here is simply that the error occurs during initial crypto-keys fetch, rather than in the sync process proper, so it's not being checked correctly: * The error bar is suppressed if the failure state is LOGIN_FAILED_NETWORK_ERROR * The logic to translate this NS_ERROR_NET_TIMEOUT instance into a LOGIN_FAILED_NETWORK_ERROR state lives in sync/modules/policies:checkServerError * The code that catches and logs this error does not call checkServerError() as seen here: http://hg.mozilla.org/mozilla-central/file/d7e78f0c1465/services/sync/modules/service.js#l624 I think we should get rid of this nested try/catch and consolidate all error handling in the existing outermost try/catch. Patch forthcoming.
Quick patch to show what I mean; tomorrow I will see about getting the tests running etc and check whether this actually does what I think it should...
(And of course there's the deeper question of why this simple request timed out, which we should also resolve)
Comment on attachment 8472824 [details] [diff] [review] sync-crypto-keys-error-handling.diff Marking for feedback from :ckarlof - does this seem like a sensible cleanup?
Attachment #8472824 - Flags: feedback?(ckarlof)
Comment on attachment 8472824 [details] [diff] [review] sync-crypto-keys-error-handling.diff I think this is worthwhile. Note however that in the existing code, the final catch block doesn't set status.login, whereas with this patch we end up with it being set to LOGIN_FAILED - I'm not sure if that's a problem
Attachment #8472824 - Flags: feedback?(ckarlof) → feedback+
I've no idea what the implications of setting status.login might be; perhaps it's safer to check add a call into the existing error path like the attached. I tried to write a test for this but couldn't figure out a way to trigger the exception, short of monkey-patching the singleton Service.handleFetchedKeys() to throw an error. Which seems awful.
Attachment #8472824 - Attachment is obsolete: true
OK, so what would it take to solve this issue? I have 4 machines already that do not sync anymore; Debian testing and Devuan stable (jessie) installations. After jumping through hoops I could make the log visible. The problem occurs at a fresh installation. Old installations (> 2 years ago) keep syncing with no problem. The bad thing is that Firefox is not even telling you that syncing failed, in an existing setup. You only find out when you arrive at home and see that your bookmarks are not in sync. Do I need to donate somewhere? Steven
(In reply to Steven Buytaert from comment #11) > After jumping through hoops I could make the log visible. Please attach it and we might be able to help.
As requested by someone; it is basically the same as all other logs reporting on this issue, AFAICS.
So this is a timeout talking to the auth servers. Seeing this intermittently would point at intermittent network issues, but if it is always happening for some people it implies some kind of mis-configuration somewhere, possibly along the lines of bug 1118958?
People waiting for a root cause solution from Mozilla, could try to lower the MTU on the interface they use; e.g. sudo ifconfig eth0 mtu 1200 and try to resync. Note, I am NOT behind an IndyMedia ISP. Note 2, that this is transient and lost after a reboot. Of course, replace eth0 with the interface routed to your ISP.
We took the error bar out :)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: