Closed Bug 973962 Opened 10 years ago Closed 8 years ago

400 in prod upload of tabs (Sync 1.1)

Categories

(Cloud Services Graveyard :: Server: Sync, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: rnewman, Assigned: bobm)

Details

(Whiteboard: [qa+])

1392742859421	Sync.Engine.Tabs	INFO	Uploading all of 1 records
1392742859421	Sync.Collection	DEBUG	POST Length: 11620
1392742859519	Sync.Collection	DEBUG	mesg: POST fail 400 https://phx-sync545.services.mozilla.com/1.1/holygoat/storage/tabs
1392742859519	Sync.Collection	DEBUG	POST fail 400 https://phx-sync545.services.mozilla.com/1.1/holygoat/storage/tabs
1392742859519	Sync.Engine.Tabs	DEBUG	Uploading records failed: <h2>Client sent a bad request.</h2>


Can someone pull a log and see if there's a bug here?
Adding :jlaz since he can also pull logs from PHX1 for you...
IIRC these "<h2>Client sent a bad request.</h2>" responses come from zeus rather than the app; is this e.g. failing a body-length check or something else at the zeus level.

(Or, I shudder to think, is this the return of the infamous "OST method" bug?)
(In reply to Ryan Kelly [:rfkelly] from comment #2)
> IIRC these "<h2>Client sent a bad request.</h2>" responses come from zeus
> rather than the app; is this e.g. failing a body-length check or something
> else at the zeus level.
> 
> (Or, I shudder to think, is this the return of the infamous "OST method"
> bug?)

Smells like OST to me.
Having a look.
Assignee: nobody → bobm
Status: NEW → ASSIGNED
There are a total of 20 "OST failures for all of today, none of which match this user.  The user has successful posts throughout the day, but no 400 level errors.  Nor, are there any tabs post records precisely matching the timestamp 1392742859519, which should equate to 09:00:59 in the Sync logs.  Here's the nearest records, with the source IP masked:

1.1.1.1 phx-sync545.services.mozilla.com holygoat [18/Feb/2014:09:00:48 -0800] "GET /1.1/holygoat/info/collections HTTP/1.1" 200 458 "-" "Firefox/30.0a1 FxSync/1.32.0.20140217030202.desktop" "-" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1.1, bits=128" node_s:1.335280 req_s:1.508051 retries:0 req_b:428 "c_l:-"
1.1.1.1 phx-sync545.services.mozilla.com holygoat [18/Feb/2014:09:00:51 -0800] "POST /1.1/holygoat/storage/forms HTTP/1.1" 200 304 "-" "Firefox/30.0a1 FxSync/1.32.0.20140217030202.desktop" "-" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1.1, bits=128" node_s:3.561858 req_s:3.562138 retries:0 req_b:1130 "c_l:657"
1.1.1.1 phx-sync545.services.mozilla.com holygoat [18/Feb/2014:09:00:55 -0800] "POST /1.1/holygoat/storage/history HTTP/1.1" 200 489 "-" "Firefox/30.0a1 FxSync/1.32.0.20140217030202.desktop" "-" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1.1, bits=128" node_s:3.068811 req_s:3.249519 retries:0 req_b:16384 "c_l:9543"
Callek just hit this, too. Aurora.

1394465391671   Sync.Engine.Tabs   INFO       Uploading all of 1 records
1394465391671   Sync.Collection    DEBUG       POST Length: 26832
1394465391852   Sync.Collection    DEBUG       mesg: POST fail 400 https://phx-sync-8-1-1.services.mozilla.com/1.1/5np652gwsvsqng44qszokzaib7eydroc/storage/tabs
1394465391852   Sync.Collection    DEBUG       POST fail 400 https://phx-sync-8-1-1.services.mozilla.com/1.1/5np652gwsvsqng44qszokzaib7eydroc/storage/tabs
1394465391852   Sync.Engine.Tabs   DEBUG      Uploading records failed: <h2>Client sent a bad request.</h2>
 
1394465391852   Sync.Status        DEBUG   Status for engine tabs: error.engine.reason.record_upload_fail
1394465391852   Sync.Status        DEBUG   Status.service: success.status_ok => error.sync.failed_partial
1394465391852   Sync.ErrorHandler  DEBUG     tabs failed: <h2>Client sent a bad request.</h2>
 No traceback available
I just hit this a few times today, I can provide you all-the-data I have in my client sync logs if you need.

I'm unsure of scope of the issue, so I don't pretend to know where in a priority list this falls, but need-info'ing to see if its worth pulling out and giving you said log.
Flags: needinfo?(bobm)
Blocks: 907475
Priority: -- → P1
Whiteboard: [qa+]
Blocks: 907479
No longer blocks: 907475
(In reply to Justin Wood (:Callek) from comment #7)
> I just hit this a few times today, I can provide you all-the-data I have in
> my client sync logs if you need.
> 
> I'm unsure of scope of the issue, so I don't pretend to know where in a
> priority list this falls, but need-info'ing to see if its worth pulling out
> and giving you said log.

Please attach.  I will find the corresponding server side logs.
Flags: needinfo?(bobm)
I sent them in e-mail, since I'm not clear on if the logs contain details that could allow someone access to my Sync store.

Feel free to attach them here if not, or share them privately among trusted devs who need-to-know if there is.
Flags: needinfo?(bobm)
Did anything come of this bug?
Flags: needinfo?(rnewman)
Flags: needinfo?(jbonacci)
Justin: only Bob can grab the server logs.

HTML in the response implies that Zeus is rejecting the request, not the Sync server.

Tabs is the last thing to sync for my clients. It should be visible after the last log item in Comment 5. That it doesn't appear there implies that either Zeus isn't logging, or the error is being returned by something other than Zeus.

Beyond that, we're out of leads.

This shouldn't occur with Sync 1.5, at least.
Flags: needinfo?(rnewman)
From a QA perspective, I have nothing to add...
:bobm ?
Flags: needinfo?(jbonacci)
Here's the logs from that Sync:

A.B.C.D phx-sync-8-1-1.services.mozilla.com user1 [10/Mar/2014:07:48:42 -0700] "GET /1.1/user1/info/collections HTTP/1.1" 200 458 "-" "Firefox/29.0a2 FxSync/1.31.0.20140306004001.desktop" "-" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1.1, bits=128" node_s:1.571448 req_s:3.726842 retries:0 req_b:463 "c_l:-"
A.B.C.D phx-sync-8-1-1.services.mozilla.com user1 [10/Mar/2014:07:48:45 -0700] "POST /1.1/user1/storage/forms HTTP/1.1" 200 325 "-" "Firefox/29.0a2 FxSync/1.31.0.20140306004001.desktop" "-" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1.1, bits=128" node_s:1.150879 req_s:3.077164 retries:0 req_b:1505 "c_l:997"
A.B.C.D phx-sync-8-1-1.services.mozilla.com user1 [10/Mar/2014:07:49:02 -0700] "POST /1.1/user1/storage/history HTTP/1.1" 200 713 "-" "Firefox/29.0a2 FxSync/1.31.0.20140306004001.desktop" "-" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1.1, bits=128" node_s:13.283485 req_s:15.902741 retries:0 req_b:28672 "c_l:17318"
A.B.C.D phx-sync-8-1-1.services.mozilla.com user1 [10/Mar/2014:07:49:43 -0700] "GET /1.1/user1/info/collections HTTP/1.1" 200 458 "-" "Firefox/29.0a2 FxSync/1.31.0.20140306004001.desktop" "-" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1.1, bits=128" node_s:3.817565 req_s:3.997670 retries:0 req_b:463 "c_l:-"
A.B.C.D phx-sync-8-1-1.services.mozilla.com user1 [10/Mar/2014:07:49:57 -0700] "POST /1.1/user1/storage/history HTTP/1.1" 200 313 "-" "Firefox/29.0a2 FxSync/1.31.0.20140306004001.desktop" "-" "ssl: SSL_RSA_WITH_RC4_128_SHA, version=TLSv1.1, bits=128" node_s:13.665252 req_s:13.848840 retries:0 req_b:2529 "c_l:2018"
A.B.C.D - - [10/Mar/2014:07:49:02 -0700] "  " - 89 "-" "-" "-" "ssl: -" node_s:- req_s:0.086835 retries:0 req_b:16627 "c_l:-"

Note the tabs post is probably the last log line.
Flags: needinfo?(bobm)
No longer blocks: 907479
OK. So now what? Status?
(In reply to James Bonacci [:jbonacci] from comment #14)
> OK. So now what? Status?

Are we safe to close this?
I have no idea...
:rnewman ?
Not yet, I suspect. Let's wait and see.
Closing cause it's old and no new activity.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Resolution: WONTFIX → INCOMPLETE
Product: Cloud Services → Cloud Services Graveyard
You need to log in before you can comment on or make changes to this bug.