Last Comment Bug 739519 - Timed Sync is throwing HTTPFailureException 412 error
: Timed Sync is throwing HTTPFailureException 412 error
Status: VERIFIED FIXED
: regression
Product: Android Background Services
Classification: Client Software
Component: Android Sync (show other bugs)
: unspecified
: ARM Android
: P1 normal
: ---
Assigned To: Marina Samuel [:emtwo]
:
:
Mentors:
Depends on: 740170
Blocks: 740247 747065
  Show dependency treegraph
 
Reported: 2012-03-26 22:26 PDT by Tony Chung [:tchung]
Modified: 2013-04-04 13:48 PDT (History)
10 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
fixed
fixed
+


Attachments
logcat (244.99 KB, text/plain)
2012-03-26 22:26 PDT, Tony Chung [:tchung]
no flags Details
2nd logcat (20.29 KB, text/plain)
2012-03-27 23:37 PDT, Tony Chung [:tchung]
no flags Details
3rd lolcat (14.75 KB, text/plain)
2012-03-28 11:55 PDT, Aaron Train [:aaronmt]
no flags Details

Description Tony Chung [:tchung] 2012-03-26 22:26:12 PDT
Created attachment 609617 [details]
logcat

saw this trolling through logcat.  my account has sync setup, and it looks like a HTTPFailureException 412 error is generated on a timed sync.

full logcat attached.

<HTTPFailureException 412 :: (412 Precondition Failed
03-26 22:18:14.132: WARN/GlobalSession(20172): Request precondition failed.
03-26 22:18:14.132: WARN/GlobalSession(20172):  clients)>
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.SyncClientsEngineStage$ClientUploadDelegate.handleRequestFailure(SyncClientsEngineStage.java:200)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRequest$SyncStorageResourceDelegate.handleHttpResponse(SyncStorageRequest.java:115)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.execute(BaseResource.java:232)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.go(BaseResource.java:254)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.put(BaseResource.java:282)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRecordRequest.put(SyncStorageRecordRequest.java:113)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRecordRequest.put(SyncStorageRecordRequest.java:124)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.SyncClientsEngineStage.uploadClientRecord(SyncClientsEngineStage.java:338)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.SyncClientsEngineStage.checkAndUpload(SyncClientsEngineStage.java:306)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.SyncClientsEngineStage$ClientDownloadDelegate.handleRequestSuccess(SyncClientsEngineStage.java:99)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageCollectionRequest$SyncCollectionResourceDelegate.handleHttpResponse(SyncStorageCollectionRequest.java:142)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.execute(BaseResource.java:232)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.go(BaseResource.java:254)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.get(BaseResource.java:260)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRequest.get(SyncStorageRequest.java:163)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.SyncClientsEngineStage.downloadClientRecords(SyncClientsEngineStage.java:325)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.SyncClientsEngineStage.execute(SyncClientsEngineStage.java:234)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.GlobalSession.advance(GlobalSession.java:233)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.EnsureKeysStage.handleRequestSuccess(EnsureKeysStage.java:84)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRequest$SyncStorageResourceDelegate.handleHttpResponse(SyncStorageRequest.java:107)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.execute(BaseResource.java:232)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.go(BaseResource.java:254)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.get(BaseResource.java:260)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRequest.get(SyncStorageRequest.java:163)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.EnsureKeysStage.execute(EnsureKeysStage.java:37)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.GlobalSession.advance(GlobalSession.java:233)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.GlobalSession.processMetaGlobal(GlobalSession.java:446)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.stage.FetchMetaGlobalStage$StageMetaGlobalDelegate.handleSuccess(FetchMetaGlobalStage.java:58)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.MetaGlobal.handleDownloadSuccess(MetaGlobal.java:181)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.MetaGlobal.handleRequestSuccess(MetaGlobal.java:167)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRequest$SyncStorageResourceDelegate.handleHttpResponse(SyncStorageRequest.java:107)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.execute(BaseResource.java:232)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.go(BaseResource.java:254)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.BaseResource.get(BaseResource.java:260)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRequest.get(SyncStorageRequest.java:163)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at org.mozilla.gecko.sync.net.SyncStorageRecordRequest$1.run(SyncStorageRecordRequest.java:132)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:442)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:305)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at java.util.concurrent.FutureTask.run(FutureTask.java:137)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1076)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:569)
03-26 22:18:14.132: WARN/GlobalSession(20172):     at java.lang.Thread.run(Thread.java:856)
03-26 22:18:14.132: INFO/SyncAdapter(20172): GlobalSession indicated error. Flagging auth token as invalid, just in case.
03-26 22:18:14.132: INFO/SyncAdapter(20172): Notifying sync monitor.
03-26 22:18:14.132: INFO/SyncAdapter(20172): Setting minimum next sync time to 1332825794142
03-26 22:18:14.163: DEBUG/class ch.boye.httpclientandroidlib.impl.conn.tsccm.ThreadSafeClientConnManager(20172): Closing expired connections
03-26 22:18:14.163: DEBUG/class ch.boye.httpclientandroidlib.impl.conn.tsccm.ConnPoolByRoute(20172): Closing expired connections
03-26 22:18:14.163: DEBUG/class ch.boye.httpclientandroidlib.impl.conn.tsccm.ThreadSafeClientConnManager(20172): Shutting down
03-26 22:18:14.163: DEBUG/class ch.boye.httpclientandroidlib.impl.conn.DefaultClientConnection(20172): Connection closed

Repro:
1) 3-26-2012 fennec nightly, Galaxy Nexus 4.0.2
2) set up sync 
3) a timed sync triggers the 412 error.

Expected;
- no error on timed sync

Actual:
- error timed sync
Comment 1 Richard Newman [:rnewman] 2012-03-26 22:29:36 PDT
Incorrect timestamp in upload?

Take a look, Marina. Feel free to unassign if you don't see a root cause.
Comment 2 Tony Chung [:tchung] 2012-03-27 00:04:32 PDT
happens each time a timed sync kicks in for me.
Comment 3 Scoobidiver (away) 2012-03-27 04:56:40 PDT
(In reply to Richard Newman [:rnewman] from comment #1)
> Take a look, Marina. Feel free to unassign if you don't see a root cause.
There's a bunch of Marina in Bugzilla.
Comment 4 Tony Chung [:tchung] 2012-03-27 08:29:55 PDT
(In reply to Scoobidiver from comment #3)
> (In reply to Richard Newman [:rnewman] from comment #1)
> > Take a look, Marina. Feel free to unassign if you don't see a root cause.
> There's a bunch of Marina in Bugzilla.

cc'd the correct marina.
Comment 5 Aaron Train [:aaronmt] 2012-03-27 08:30:52 PDT
I get this too; Nightly (03/27) / Galaxy Nexus (Android 4.0.2) immediately following a manual 'Sync Now'.
Comment 6 Richard Newman [:rnewman] 2012-03-27 08:37:50 PDT
(In reply to Scoobidiver from comment #3)
> There's a bunch of Marina in Bugzilla.

(In reply to Tony Chung [:tchung] from comment #4)
> cc'd the correct marina.

Marina is (or should be) watching the component.

I assigned to the correct one, but Tony and I had a silent mid-air collision.
Comment 7 Allison Naaktgeboren :ally 2012-03-27 11:43:15 PDT
marina, have you had a chance to investigate?
Comment 8 Marina Samuel [:emtwo] 2012-03-27 12:58:27 PDT
Ally: still looking into it.
Comment 9 Marina Samuel [:emtwo] 2012-03-27 13:30:50 PDT
tchung: To help figure out the cause (since I'm having trouble reproducing) I have a couple of questions:

* Does sync error if you force a sync?
* Does this occur every time you re-setup sync? Or is it just with a particular account that the error keeps occurring?
Comment 10 Tony Chung [:tchung] 2012-03-27 14:23:21 PDT
(In reply to Marina Samuel [:emtwo] from comment #9)
> tchung: To help figure out the cause (since I'm having trouble reproducing)
> I have a couple of questions:
> 
> * Does sync error if you force a sync?
> * Does this occur every time you re-setup sync? Or is it just with a
> particular account that the error keeps occurring?

I saw this firing a lot.  This was not forcing sync, just sitting idle.   Unfortunately,fennec has been crashing for me lately on another issue, but this exception does not help narrow it down.   

I can look at this more tonite.  but in the meantime, i recommend you try grabbing the nightly fennec build i used from the ftp.
Comment 11 Marina Samuel [:emtwo] 2012-03-27 14:42:26 PDT
I did try it with nightly fennec, but I may have grabbed the wrong build. This is the one I tried: 

https://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/2012-03-27-10-05-41-mozilla-central-android/fennec-14.0a1.en-US.android-arm.apk

Should I try another one?
Comment 12 Tony Chung [:tchung] 2012-03-27 15:15:26 PDT
Ah, i didnt mention it earlier.  but i was running the 3-26 nightly.  i really dont think that makes a difference, but if you want to be on par, you can try and see if that fails for you.

Also, it may be my profile.
Comment 13 Richard Newman [:rnewman] 2012-03-27 20:42:58 PDT
Happening to dougt, too. I've asked for a trace log.
Comment 14 Richard Newman [:rnewman] 2012-03-27 20:57:33 PDT
Some thinking:

The only way this can happen is if the timestamp for our client record on the server is different to the timestamp that we have.

Idea: millisecondsToDecimalSecondsString is returning a different string representation to the server. Once wedged in this situation, upload will always fail.

We don't log that right now, so let me land some logging for it…
Comment 15 Tony Chung [:tchung] 2012-03-27 23:37:52 PDT
Created attachment 610031 [details]
2nd logcat

Here's another logcat.   This time, i'm on 3-27 nightly, and its just sitting there idle after starting up.   I didnt initiate any activity on fennec.  it looks like the sync timer kicked in, and find the HTTPFailureException 412 log kick in again.
Comment 16 Tony Chung [:tchung] 2012-03-28 00:01:06 PDT
Ah, a force sync in Sync settings, will also trigger this.   This didnt used to happen.  Regression?
Comment 17 Richard Newman [:rnewman] 2012-03-28 01:25:50 PDT
(In reply to Tony Chung [:tchung] from comment #16)
> Ah, a force sync in Sync settings, will also trigger this.   This didnt used
> to happen.  Regression?

That depends on whether you think taking the bus, buying a new car, crashing it, and then taking the bus the next day counts as a "transport regression" :D

I'm still unable to reproduce this, and apparently so is Marina.

I have some guesses as to the cause, but Marina will be chasing tomorrow!
Comment 18 Aaron Train [:aaronmt] 2012-03-28 11:55:25 PDT
Created attachment 610233 [details]
3rd lolcat
Comment 19 Richard Newman [:rnewman] 2012-03-28 13:58:46 PDT
This might be why.

http://people.mozilla.com/~rnewman/bugs/header.png

The Sync server is returning two *different* timestamps in response to a PUT. Only one of them will pass validation. I haven't threaded through our code yet to see if we use these values rather than the modified time in a GET, but it's a little suspect…
Comment 20 Richard Newman [:rnewman] 2012-03-28 14:07:54 PDT
(Flagsigh.)

Confirmed that using the X-Weave-Timestamp header instead of the PUT response body can cause a 412.

[13:52:59.002] p.put(resp);
[13:52:59.323] (new String("1332967979.23"))
--
// Header is .22.
[14:04:30.048] p.setHeader("X-If-Unmodified-Since", "1332967979.22");
[14:04:30.055] undefined
[14:04:32.488] p.put(resp);
[14:04:32.936] (new String("<html>\n <head>\n  <title>412 Precondition Failed</title>\n </head>\n <body>\n  <h1>412 Precondition Failed</h1>\n  Request precondition failed.<br /><br />\nclients\n\n\n </body>\n</html>"))

So two areas of confusion:

* Server-side, why is this happening?
* Client-side, why are we not subsequently downloading our own record and thus bumping our modified time? (That is, before we do another upload, we will first download our own record, right?)

Perhaps the answer to the latter is that we don't suck, and are circumspect about whether we download, as we should be…
Comment 21 Ryan Kelly [:rfkelly] 2012-03-28 15:26:07 PDT
I have managed to reproduce this as a server-side test case, but it only happens when PUTing a metadata update, i.e. when the PUT body does not contain a "payload" field.  Richard, can you confirm whether this is the case when the error occurs for the client?
Comment 22 Richard Newman [:rnewman] 2012-03-28 16:49:51 PDT
Gist for JS code to repro:

https://gist.github.com/2231561
Comment 23 Richard Newman [:rnewman] 2012-03-28 16:53:57 PDT
(In reply to Ryan Kelly [:rfkelly] from comment #21)
> I have managed to reproduce this as a server-side test case, but it only
> happens when PUTing a metadata update, i.e. when the PUT body does not
> contain a "payload" field.  Richard, can you confirm whether this is the
> case when the error occurs for the client?

This is uploading a whole record (most likely with an identical payload to the one on the server). Every PUT body sent by every deployed Sync client includes a payload.
Comment 24 Richard Newman [:rnewman] 2012-03-28 17:27:54 PDT
Waiting for the tree to open before landing a quick fix, then we'll figure out if there's anything wrong with the client while we wait for a server fix.
Comment 25 Richard Newman [:rnewman] 2012-03-28 18:40:16 PDT
https://hg.mozilla.org/mozilla-central/rev/8365faf3a156
https://hg.mozilla.org/mozilla-central/rev/1965a2c89d61

I landed Marina's quick fix (don't send X-If-Unmodified-Since at all) and a logging addition patch.

Landed directly onto m-c so that it unbreaks Nightly tomorrow.

After this we can switch to using the body until the Sync storage server is fixed, and also examine the client in case there are other problems.

Back over to you, Marina!
Comment 26 Richard Newman [:rnewman] 2012-03-28 18:42:44 PDT
Marking this as fixed to avoid confusion; Marina, please file a follow-up bug for using the body value (blocked on Bug 740170 and this bug) and restoring X-I-U-S, and new bugs for anything you find b0rked about download.

Ta!
Comment 27 Richard Newman [:rnewman] 2012-03-28 22:23:24 PDT
To summarize, we think the following is occurring:

The server is returning two different timestamps in response to a PUT (Bug 740170). We -- thanks to sod's law -- use the one that will cause a 412 if we use it on a subsequent PUT. That means that if we decide to upload a record, and we didn't download our record again in the interim (which overwrites the timestamp), then we'll get a 412 on every attempt.

But we should always redownload (we do every time), so this should not occur.

Our hypothesis is that either the client record was removed from the server by a wipe or node reassignment, or the Sync account was removed and re-added on the phone.

This will leave prefs hanging around that suggest a timestamp to include in X-I-U-S for a PUT, even when there is no record to replace on the server (either because it was removed, or because we generated a new client ID). The server correctly returns a 412 if you include X-I-U-S and there is no record on the server with that ID.

We've worked around all this by simply omitting X-I-U-S. Further work continues in Bug 740247 (and, of course, Bug 740170).
Comment 28 Tony Chung [:tchung] 2012-04-02 21:16:23 PDT
Verified fix on 4-2-2012 trunk.
Comment 30 Ryan Kelly [:rfkelly] 2012-05-16 15:06:53 PDT
The underlying server bug (Bug 740170) has been fixed and deployed per Bug 748164.  It should no longer be possible to get different timestamps in response header and response body from the mozilla-hosted sync server.

Note You need to log in before you can comment on or make changes to this bug.