Closed Bug 667060 Opened 9 years ago Closed 9 years ago

resp.obj is null in staging upload of bookmarks

Categories

(Cloud Services :: Operations: Miscellaneous, task)

task
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rnewman, Assigned: petef)

References

Details

(Whiteboard: [qa+])

Follow-up from email/IRC.

This looks suspect:

326529024[103002190]: max hang time exceeded!
I'm seeing similar in history.
308951402725	Service.Main	DEBUG	history failed: resp.obj is null JS Stack trace: ("all")@engines.js:930 < innerBind("all")@util.js:1352 < SyncEngine__uploadOutgoing()@engines.js:966 < SyncEngine__sync()@engines.js:1017 < WrappedNotify()@util.js:164 < Engine_sync()@engines.js:388 < WeaveSvc__syncEngine([object Object])@service.js:1917 < ()@service.js:1797 < WrappedNotify()@util.js:164 < WrappedLock()@util.js:132 < _lockedSync()@service.js:1699 < ()@service.js:1690 < WrappedCatch()@util.js:106 < sync()@service.js:1671 < (7)@browser.js:5346
petef's done some poking around.

14:57:30 < petef> claims nginx returns 499 when the client unexpectedly closes the connection
14:57:36 < petef> now technically the "client" is zeus in this case
14:57:43 < petef> so that means zeus exceeded it's 30 second timeout
14:58:04 < petef> i've been seeing a very small percentage of these in loadtest but marked them off to breaking mysql under load (like .04%)

15:02:22 < petef> zeus is timing out and closes the connection, is my theorey
15:02:25 < petef> but it should be sending a 503 then
15:02:27 < petef> so here's the zeus log line:
15:02:31 < petef> 10.250.5.41 stage-node02.services.mozilla.com yygprfpgeyc4nis2puw2dltgbd6e5njq [24/Jun/2011:10:00:17 -0700] "POST /1.0/yygprfpgeyc4nis2puw2dltgbd6e5njq/storage/bookmarks HTTP/1.1" - 0 "-" "Mozilla/5.0 (Windows NT 6.1; 
                  rv:2.0) Gecko/20100101 Firefox/4.0" "WT_FPC=id=63.245.220.240-3214383408.30159502:lv=1308959391310:ss=1308959391310; wtspl=481554" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1, bits=128"

15:02:39 < petef> that "-" is in place of the return code
15:02:45 < telliott> no status?
15:02:46 < telliott> yikes
Throwing this at ops, because it looks like it's a Zeus/nginx interaction.
Component: Firefox Sync: Backend → Operations
QA Contact: sync-backend → operations
Blocks: 666805
dropped the batch_size down to 20 for inserts, and we're no longer timing out
Assignee: nobody → petef
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Is it feasible to have a followup here that would result in more client-friendly behavior in the event of a timeout? E.g., can we send a 5xx response with a real body if Zeus hits its timeout?

The client operates under the assumption that it's always going to get a protocol error or a valid JSON body.
"499" is not currently trapped for conversion to a 503, but it can be if that's desirable. Pete?
667570 filed.  atoll, I don't think zeus actually sees a 499 (it disconnects and causes it).
I'm seeing a lot of "Sync encountered an error while syncing: Unknown error." messages at present, and I see a lot of those messages in about:sync-log :

1315523009529	Service.Main	DEBUG	history failed: resp.obj is null JS Stack trace: ("all")@engines.js:932 < innerBind("all")@util.js:1180 < SyncEngine__uploadOutgoing()@engines.js:968 < SyncEngine__sync()@engines.js:1019 < WrappedNotify()@util.js:169 < Engine_sync()@engines.js:390 < WeaveSvc__syncEngine([object Object])@service.js:1953 < ()@service.js:1833 < WrappedNotify()@util.js:169 < WrappedLock()@util.js:137 < _lockedSync()@service.js:1735 < ()@service.js:1726 < WrappedCatch()@util.js:111 < sync()@service.js:1707 < (1)@browser.js:5368 < @:0

is an example, but I have five in the sync-log just on this system. This bug is marked as FIXED, but I gotta admit I can't understand half the comments. =)

This system's running Firefox 6 on Fedora 16.
(In reply to Adam Williamson from comment #8)
> I'm seeing a lot of "Sync encountered an error while syncing: Unknown
> error." messages at present, and I see a lot of those messages in
> about:sync-log :

We'd be happy to consider your problem further if you could file a new bug about it under Mozilla Services / General.  However this bug is closed and no further action will be taken on requests here.
Blocks: 696137
Whiteboard: [qa+]
You need to log in before you can comment on or make changes to this bug.