Closed
Bug 752687
Opened 13 years ago
Closed 13 years ago
Android Sync hangs on sync when server has missing (404) meta/global record
Categories
(Firefox for Android Graveyard :: Android Sync, defect)
Tracking
(blocking-fennec1.0 beta+)
RESOLVED
FIXED
| Tracking | Status | |
|---|---|---|
| blocking-fennec1.0 | --- | beta+ |
People
(Reporter: nalexander, Assigned: nalexander)
Details
(Whiteboard: [qa+])
Attachments
(12 files, 7 obsolete files)
|
1.76 KB,
text/plain
|
Details | |
|
22.62 KB,
text/plain
|
Details | |
|
5.17 KB,
text/plain
|
Details | |
|
3.70 KB,
text/plain
|
Details | |
|
5.10 KB,
patch
|
rnewman
:
review+
mfinkle
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
|
328.67 KB,
text/plain
|
Details | |
|
185.76 KB,
text/plain
|
Details | |
|
3.34 KB,
patch
|
Details | Diff | Splinter Review | |
|
17.82 KB,
text/plain
|
Details | |
|
8.41 KB,
text/plain
|
Details | |
|
5.01 KB,
text/plain
|
Details | |
|
27.77 KB,
patch
|
Details | Diff | Splinter Review |
On finding a missing (404) meta/global record, `FetchMetaGlobalStage` calls `GlobalSession.processMissingMetaGlobal()`.
`GlobalSession.processMissingMetaGlobal()`calls `GlobalSession.freshStart()` which calls `GlobalSession.freshStart(GlobalSession, FreshStartDelegate)` which calls `GlobalSession.wipeServer(GlobalSession, WipeServerDelegate)`.
The wipe server callback `onWiped` calls `MetaGlobal.upload(MetaGlobalDelegate)`, which has not been implemented. `MetaGlobal.upload(MetaGlobalDelegate)` calls `SyncStorageRecordRequest.deferPut(JSONObject)` with parameter `null` (instead of generating a valid meta/global record body -- this is most of the content of Bug 709313).
`SyncStorageRecordRequest.deferPut(JSONObject)` spawns a new thread that calls `put` with the parameter. This put tries calls `.toJSONObject()` on the parameter, which should throw `NullPointerException`. This exception doesn't appear to be caught anywhere, and isn't reported. Instead, we just die and never alert the monitor to fail the sync.
| Assignee | ||
Comment 1•13 years ago
|
||
| Assignee | ||
Comment 2•13 years ago
|
||
(In reply to Nick Alexander :nalexander from comment #1)
> Created attachment 621732 [details]
> Suggested minimal patch against m-c
This patch makes `MetaGlobal.upload(MetaGlobalDelegate)</tt> fail immediately. This will abort the sync immediately. Android Sync clients will continue to fail until Android stops them syncing or a Desktop client initializes the server with a correct meta/global record.
| Assignee | ||
Comment 3•13 years ago
|
||
| Assignee | ||
Comment 4•13 years ago
|
||
Comment 5•13 years ago
|
||
Just clarifying for triage: the purpose of this bug is to be an immediate stop-gap that we can get into a beta to avoid incorrect behavior before Bug 745430 and Bug 745431 are ready.
The idea is to minimize risk.
Comment 6•13 years ago
|
||
Nick's been doing some analysis of failure cases as part of his work on those bugs. We're tracking those here:
https://wiki.mozilla.org/User:Anaaktgeboren/NativeSync#Cases_that_need_testing_for_blank_server_and_corrupted_server
| Assignee | ||
Comment 7•13 years ago
|
||
Attachment #621732 -
Attachment is obsolete: true
| Assignee | ||
Comment 8•13 years ago
|
||
| Assignee | ||
Comment 9•13 years ago
|
||
| Assignee | ||
Comment 10•13 years ago
|
||
| Assignee | ||
Comment 11•13 years ago
|
||
Attachment #621832 -
Attachment is obsolete: true
| Assignee | ||
Comment 12•13 years ago
|
||
(In reply to Nick Alexander :nalexander from comment #11)
> Created attachment 621862 [details]
> Suggested patch against mozilla-aurora
This patch is longer because I commented things out as block comments. I think this makes it easier to see exactly what has been removed. I also made the session abort messages different (and hopefully more instructive) in each case. See attached logs.
| Assignee | ||
Comment 13•13 years ago
|
||
Attachment #621733 -
Attachment is obsolete: true
| Assignee | ||
Comment 14•13 years ago
|
||
Attachment #621834 -
Attachment is obsolete: true
| Assignee | ||
Comment 15•13 years ago
|
||
Attachment #621734 -
Attachment is obsolete: true
Attachment #621833 -
Attachment is obsolete: true
Comment 16•13 years ago
|
||
Extracted from earlier communiques:
This will let us "do no harm" until we're ready to land stable release blocker code, which will include a more thorough solution to this case (namely, doing the right thing!) and address the remaining scenarios that we need to cover eventually, but are too involved to band-aid.
In my opinion this hits the sweet spot of very low risk and tidy damage-avoidance, and so I'd like to get this landed before we cut a beta if we can.
Attachment #621862 -
Attachment is obsolete: true
Attachment #622011 -
Flags: review+
Attachment #622011 -
Flags: approval-mozilla-aurora?
Updated•13 years ago
|
Assignee: nobody → nalexander
Status: NEW → ASSIGNED
Comment 17•13 years ago
|
||
Local Aurora build:
http://people.mozilla.org/~rnewman/fennec_meta.apk
Comment 18•13 years ago
|
||
STRs via irc:
rnewman: you can test by deleting an account through Account Portal (if it's a newer account, it's more likely to not time out!)
[09:54am] rnewman: assuming that just wipes data
[09:55am] rnewman: failing that, do the "delete server" case here: https://wiki.mozilla.org/Services/Sync/Snippets#Partially_corrupt_a_server
[09:55am] tchung: rnewman: so what's the str -- sync account on android, delete account from AP, and try to resync on android?
[09:57am] rnewman: I suggest using the snippet instead, because on reflection I don't know precisely what AP does
[09:57am] rnewman: but yes
[09:57am] rnewman: setup, sync, clear server from desktop (and quit desktop to avoid another sync), sync android, verify from log that it doesn't upload a meta/global or hang
Whiteboard: [qa+]
| Assignee | ||
Comment 19•13 years ago
|
||
My test procedure:
*** verify trivial crypto/keys changes are processed correctly
- sync device to existing account
- bumped crypto/keys modified time using
Components.utils.import("resource://services-sync/main.js");
Components.utils.import("resource://services-sync/resource.js");
let r = new Resource(Weave.Service.storageURL + "crypto/keys");
let g = r.get();
r.put(g);
- device sync completed just fine, and correctly fetched fresh keys quietly:
I GlobalSession(13112) Running next stage ensureKeysStage (org.mozilla.gecko.sync.stage.EnsureCrypto5KeysStage@411f9870)...
I EnsureC5KeysStage(13112) Fetching fresh collection keys for this session.
D dalvikvm(13112) GC_CONCURRENT freed 556K, 10% free 8254K/9095K, paused 2ms+4ms
I EnsureC5KeysStage(13112) New default key is the same as old default key; returning changed individual keys.
*** verify deleted crypto/keys record correctly aborts sync
- sync device to existing account
- deleted crypto keys using
Components.utils.import("resource://services-sync/main.js");
Components.utils.import("resource://services-sync/resource.js");
function deletePath(path) {
let resource = new Resource(Weave.Service.storageURL + path);
resource.setHeader("X-Confirm-Delete", "1");
return resource.delete();
}
deletePath("crypto/keys");
- device sync correctly aborted with
W GlobalSession(19091) Aborting sync: aborting after finding keys missing because uploading fresh crypto/keys record is not yet implemented.
*** verify missing meta/global record correctly aborts sync
- sync device to existing account
- wipe server, either using Account Portal or
Components.utils.import("resource://services-sync/main.js");
Components.utils.import("resource://services-sync/resource.js");
function deletePath(path) {
let resource = new Resource(Weave.Service.storageURL + path);
resource.setHeader("X-Confirm-Delete", "1");
return resource.delete();
}
deletePath("");
- device sync correctly aborted with
W GlobalSession(19091) Aborting sync: aborting freshStart because uploading fresh meta/global record is not yet implemented.
Comment 20•13 years ago
|
||
dear mobile triage: as per rnewman's email to drivers this morning, we desire to uplift this, so as to do no harm while we fix teh release blockers related to the following issues: <https://wiki.mozilla.org/User:Anaaktgeboren/NativeSync#Cases_that_Android_Sync_does_not_handle:_blank_and_corrupted_servers>
blocking-fennec1.0: --- → ?
Updated•13 years ago
|
blocking-fennec1.0: ? → beta+
Updated•13 years ago
|
Attachment #622011 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 21•13 years ago
|
||
blocking-fennec1.0: beta+ → ---
status-firefox14:
--- → fixed
Comment 22•13 years ago
|
||
mfinkle & rnewman's race dropped the flag :)
blocking-fennec1.0: --- → beta+
Comment 23•13 years ago
|
||
didnt see the meta/global issue reported in comment 0, but here's my log. Tested on HTC Desire, rnewman's try build.
Comment 24•13 years ago
|
||
I get an expected 401 Unauthorized. No hang and Sync account on Android is saying "Sync is currently experiencing problems. It will be back shortly."
Comment 25•13 years ago
|
||
I reran the tests again, and this time i'm hitting the blank server during sync.
STR: Nexus S, rnewman's build --> sync android, on desktop run snippets (but dont kill desktop), and force sync on android afterwards.
05-08 21:41:42.031: INFO/SyncAdapter(718): Forced sync: overruling remaining backoff of 218937ms.
05-08 21:41:42.031: INFO/SyncAdapter(718): Got onPerformSync. Extras bundle is Bundle[{ignore_backoff=true, ignore_settings=true, force=true}]
05-08 21:41:42.031: INFO/SyncAdapter(718): Account name: testme@me.com
05-08 21:41:42.031: DEBUG/SyncAdapter(718): Invalidating auth token.
05-08 21:41:42.265: INFO/SyncAuthService(718): AccountManager.get(org.mozilla.gecko.sync.setup.SyncAuthenticatorService@40713fb0)
05-08 21:41:42.277: INFO/SyncAdapter(718): Waiting on sync monitor.
05-08 21:41:42.304: INFO/SyncAuthService(718): Setting username. Null? false
05-08 21:41:42.308: INFO/SyncAuthService(718): Setting Sync Key. Null? false
05-08 21:41:42.410: INFO/SyncAdapter(718): AccountManagerCallback invoked.
05-08 21:41:42.410: DEBUG/SyncAdapter(718): Username: 5hdzttjtqbtegdx53wirzg5rxzsybfon
05-08 21:41:42.410: DEBUG/SyncAdapter(718): Server: https://auth.services.mozilla.com/
05-08 21:41:42.410: DEBUG/SyncAdapter(718): Password? true
05-08 21:41:42.410: DEBUG/SyncAdapter(718): Key? true
05-08 21:41:42.460: INFO/SyncAdapter(718): Performing sync.
05-08 21:41:42.460: INFO/GlobalSession(718): GlobalSession initialized with bundle Bundle[{ignore_backoff=true, ignore_settings=true, force=true}]
05-08 21:41:42.480: INFO/SyncConfiguration(718): Set clusterURL from bundle: https://phx-sync322.services.mozilla.com/
05-08 21:41:42.480: INFO/SyncConfiguration(718): Set syncID from bundle: m6PuivGPdMBh
05-08 21:41:42.480: INFO/SyncAdapter(718): Stage completed: idle
05-08 21:41:42.484: INFO/GlobalSession(718): Running next stage checkPreconditions (org.mozilla.gecko.sync.stage.CheckPreconditionsStage@4072d4f8)...
05-08 21:41:42.484: INFO/SyncAdapter(718): Stage completed: checkPreconditions
05-08 21:41:42.484: INFO/GlobalSession(718): Running next stage ensureClusterURL (org.mozilla.gecko.sync.stage.EnsureClusterURLStage@4072afe8)...
05-08 21:41:42.484: INFO/EnsureClusterURLStage(718): Cluster URL is already set and not stale. Continuing with sync.
05-08 21:41:42.484: INFO/SyncAdapter(718): Stage completed: ensureClusterURL
05-08 21:41:42.484: INFO/GlobalSession(718): Running next stage fetchInfoCollections (org.mozilla.gecko.sync.stage.FetchInfoCollectionsStage@4071df70)...
05-08 21:41:42.953: INFO/InfoCollections(718): info/collections is {}
05-08 21:41:42.953: INFO/FetchInfoCollStage(718): Got timestamps:
05-08 21:41:42.953: INFO/SyncAdapter(718): Stage completed: fetchInfoCollections
05-08 21:41:42.957: INFO/GlobalSession(718): Running next stage fetchMetaGlobal (org.mozilla.gecko.sync.stage.FetchMetaGlobalStage@4071ae70)...
05-08 21:41:42.957: INFO/FetchMetaGlobalStage(718): Fetching fresh meta/global for this session.
05-08 21:41:43.070: ERROR/GPS(112): [on_request_connection][line = 1061] : Failed with NO SESSION SLOT
05-08 21:41:43.191: WARN/SSResourceDelegate(718): HTTP request failed.
05-08 21:41:43.214: WARN/SSResourceDelegate(718): HTTP response body: 404 Not Found
05-08 21:41:43.214: WARN/SSResourceDelegate(718): The resource could not be found.
05-08 21:41:43.218: WARN/GlobalSession(718): Aborting sync: aborting freshStart because uploading fresh meta/global record is not yet implemented.
05-08 21:41:43.218: INFO/SyncAdapter(718): GlobalSession indicated error. Flagging auth token as invalid, just in case.
Comment 26•13 years ago
|
||
Any reason not to land this on inbound too?
Comment 27•13 years ago
|
||
(In reply to JP Rosevear [:jpr] from comment #26)
> Any reason not to land this on inbound too?
can someone from sync dev confirm my log results is correct and addresses the original hang?
Comment 28•13 years ago
|
||
(In reply to JP Rosevear [:jpr] from comment #26)
> Any reason not to land this on inbound too?
Rolling up a final patch for inbound now.
(In reply to Tony Chung [:tchung] from comment #27)
> can someone from sync dev confirm my log results is correct and addresses
> the original hang?
Confirmed.
| Assignee | ||
Comment 29•13 years ago
|
||
| Assignee | ||
Comment 30•13 years ago
|
||
| Assignee | ||
Comment 31•13 years ago
|
||
| Assignee | ||
Comment 32•13 years ago
|
||
Updated•13 years ago
|
Whiteboard: [qa+] → [qa+][landed for beta][needs to land on inbound]
Comment 33•13 years ago
|
||
We (mconnor & I) have decided not land this on inbound, since this is just a bandaid for aurora & beta while the proper patches land.
Thanks to nick for uploading inbound patches.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: [qa+][landed for beta][needs to land on inbound] → [qa+]
Updated•12 years ago
|
Product: Mozilla Services → Android Background Services
Comment 34•9 years ago
|
||
status-firefox14:
fixed → ---
Flags: needinfo?(nalexander)
| Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(nalexander)
Updated•8 years ago
|
Product: Android Background Services → Firefox for Android
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•