Closed
Bug 1316249
Opened 8 years ago
Closed 8 years ago
Fails uploading to tooltool RuntimeError: Non-200 return from AWS: 400 Bad Request
Categories
(Release Engineering :: General, defect)
Release Engineering
General
Tracking
(firefox52 fixed)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox52 | --- | fixed |
People
(Reporter: ting, Unassigned)
Details
Attachments
(2 files)
Ran into this in bug 1251936, I am located in TPO.
$ python tooltool.py upload --authentication-file=.tooltool-token --message "Bug 1251936 - New Windows clang for x86
/x64, r283955"
INFO - clang64.tar.bz2: starting upload
INFO - clang32.tar.bz2: starting upload
ERROR - clang32.tar.bz2: failed
Traceback (most recent call last):
File "tooltool.py", line 791, in _s3_upload
(resp.status, resp.reason, resp_body))
RuntimeError: Non-200 return from AWS: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?><Error><Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.</Message><RequestId>20E806D572EAA723</RequestId><HostId>tIFXayLqhUbP7PZ17XSaKW+PyUMNFBtCbGiUb3QFT37phf7xQmSjo3gvDosD7FvYo1i7PR+jDB8yxOEfus9fuI51i/5OM3oV</HostId></Error>
ERROR - clang64.tar.bz2: failed
Traceback (most recent call last):
File "tooltool.py", line 791, in _s3_upload
(resp.status, resp.reason, resp_body))
RuntimeError: Non-200 return from AWS: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?><Error><Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.</Message><RequestId>CF7F00947BB490F2</RequestId><HostId>SpjvbaWXoeMW/vJTcNgMJ9p7gA9LAwkWvxq9P4sBzphpEHlehyytIg3ciDO8J7OhrHm8rijubqEDhVyMe9n1Y1QUCCSKktBh</HostId></Error>
Reporter | ||
Comment 1•8 years ago
|
||
Captured by command: ">WinDump.exe -s0 -v -w tcpdump.pcap", I tried but am bad at interpreting the log. :(
Flags: needinfo?(dustin)
Reporter | ||
Updated•8 years ago
|
Summary: Fails uploading to tooltool → Fails uploading to tooltool RuntimeError: Non-200 return from AWS: 400 Bad Request
Reporter | ||
Comment 2•8 years ago
|
||
BTW, I don't know how to delete the batches on https://api.pub.build.mozilla.org/tooltool/ from failed uploading.
![]() |
||
Comment 3•8 years ago
|
||
How big is the file?
Comment 4•8 years ago
|
||
Don't worry about the failed uploads; since they have no associated files, nobody will ever find them :)
Before diving in, I should qualify and say that I haven't heard of anyone else having this kind of issue.
Here's the relevant bit of the pcap. I don't know how familiar you are with TCP, but I'll write this out here anyway :)
look up api.pub:
06:05:08.687714 IP 10.247.30.97.63144 > 10.247.75.120.53: 47188+ A? api.pub.build.mozilla.org. (43)
06:05:08.688678 IP 10.247.75.120.53 > 10.247.30.97.63144: 47188 2/0/0 CNAME relengapi-zlb.vips.scl3.mozilla.com., A 63.245.215.27 (108)
look up the S3 bucket URL:
06:05:09.336483 IP 10.247.30.97.64338 > 10.247.75.120.53: 32472+ A? mozilla-releng-usw2-tooltool.s3-us-west-2.amazonaws.com. (73)
06:05:09.361308 IP 10.247.30.97.64338 > 10.247.75.120.53: 32472+ A? mozilla-releng-usw2-tooltool.s3-us-west-2.amazonaws.com. (73)
06:05:09.444379 IP 10.247.75.120.53 > 10.247.30.97.64338: 32472 2/0/0 CNAME s3-us-west-2-r-w.amazonaws.com., A 54.231.169.13 (120)
connect to S3:
06:05:09.445037 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [S], seq 1773387666, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
06:05:09.620559 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [S.], seq 1941016628, ack 1773387667, win 8192, options [mss 1432,wscale 8,nop,sackOK,nop,nop], length 0
06:05:09.620644 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [.], ack 1, win 257, length 0
TLS setup:
06:05:09.620919 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [P.], seq 1:518, ack 1, win 257, length 517
06:05:09.795139 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 518, win 62, length 0
06:05:09.798327 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], seq 1:1433, ack 518, win 62, length 1432
06:05:09.798394 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], seq 1433:2865, ack 518, win 62, length 1432
06:05:09.798395 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [P.], seq 2865:3150, ack 518, win 62, length 285
06:05:09.798444 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [.], ack 3150, win 257, length 0
06:05:09.800839 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [P.], seq 518:668, ack 3150, win 257, length 150
06:05:09.975943 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [P.], seq 3150:3225, ack 668, win 66, length 75
HTTP request (I'm guessing here, it's encrypted after all)
06:05:09.977644 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [P.], seq 668:1121, ack 3225, win 257, length 453
Push data; these have a length of 0 because your system is letting the network interface do the calculation. Judging by the later ACK's, these are legitimate, full-sized PSH segments (of about 1.5k each). I've guessed at the sequence numbers based on the ethernet packet sizes (from tcpdump -e).
06:05:09.977874 IP bad-len 0 # seq 1121:2640
06:05:09.977956 IP bad-len 0 # seq ..:4117
06:05:09.978313 IP bad-len 0 # seq ..:5693
06:05:09.978488 IP bad-len 0 # seq ..:7366
06:05:09.978674 IP bad-len 0 # seq ..:9679
06:05:09.978972 IP bad-len 0 # seq ..:11959
06:05:09.979217 IP bad-len 0 # seq ..:13482
06:05:09.979426 IP bad-len 0 # seq ..:15208
After 15k (15655 bytes to be exact), your system sends a smaller segment, probably due to window size, as none of its pushes have been ACKed yet
06:05:09.979645 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [P.], seq 15208:15655, ack 3225, win 257, length 447
ACKs start coming back from S3, and your system continues sending data as the window advances
06:05:10.151978 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 2553, win 81, length 0
06:05:10.152024 IP bad-len 0 # seq ..:18519
06:05:10.152247 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 4072, win 92, length 0
06:05:10.152263 IP bad-len 0 # seq ..:21383
06:05:10.152708 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 5549, win 104, length 0
06:05:10.152709 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 7125, win 126, length 0
06:05:10.152732 IP bad-len 0 # seq ..:28543
06:05:10.152921 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 8798, win 148, length 0
06:05:10.152937 IP bad-len 0 # seq ..:31407
06:05:10.153022 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 11111, win 165, length 0
06:05:10.153039 IP bad-len 0 # seq ..:36538
And, that's the last byte of data we see from your system: 36.5k. I think the file you're uploading is larger than that! S3 proceeds to ACK all of that data (which has been re-segmented by the network interface, hence the different sequence numbers here)
06:05:10.153663 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 13391, win 165, length 0
06:05:10.153663 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 15208, win 158, length 0
06:05:10.195567 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 15655, win 165, length 0
06:05:10.326343 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 17087, win 165, length 0
06:05:10.326344 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 18519, win 160, length 0
06:05:10.326344 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 19951, win 155, length 0
06:05:10.326344 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 21383, win 150, length 0
06:05:10.326928 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 22815, win 165, length 0
06:05:10.326928 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 25679, win 158, length 0
06:05:10.326929 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 28543, win 147, length 0
06:05:10.327210 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 31407, win 158, length 0
06:05:10.327287 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 34271, win 158, length 0
06:05:10.327288 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [.], ack 36538, win 150, length 0
By now your system has stopped sending data after byte 36538, and S3 has ACKed everything up to that byte. After exactly 15 seconds, S3 sends an HTTP response with the XML error message about the timeout that you eventually see onscreen:
06:05:25.326949 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [P.], seq 3225:3950, ack 36538, win 165, length 725
06:05:25.326950 IP 54.231.169.13.443 > 10.247.30.97.49691: Flags [P.], seq 3950:4003, ack 36538, win 165, length 53
Your system immediately ACK's that response and then hangs up the connection:
06:05:25.326987 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [.], ack 4003, win 253, length 0
06:05:25.331795 IP 10.247.30.97.49691 > 54.231.169.13.443: Flags [R.], seq 36538, ack 4003, win 0, length 0
So I suppose the question is, why would your system stop sending data after only 36.5k? It's not *hung*, as it responds to the error message immediately, so it must be in some kind of wait state.
The code that's sending the data is not complicated:
https://github.com/mozilla/build-tooltool/blob/master/tooltool.py#L775
def _s3_upload(filename, file):
# urllib2 does not support streaming, so we fall back to good old httplib
url = urlparse.urlparse(file['put_url'])
cls = httplib.HTTPSConnection if url.scheme == 'https' else httplib.HTTPConnection
host, port = url.netloc.split(':') if ':' in url.netloc else (url.netloc, 443)
port = int(port)
conn = cls(host, port)
try:
req_path = "%s?%s" % (url.path, url.query) if url.query else url.path
conn.request('PUT', req_path, open(filename),
{'Content-type': 'application/octet-stream'})
resp = conn.getresponse()
resp_body = resp.read()
conn.close()
I do see one potential issue, though: I didn't open in binary mode. Try changing `open(filename)` to `open(filename, "rb")`? I can see how that might cause an issue if, indeed, the file you're uploading is about 36k, as httplib would send a different content-length than the actual number of bytes transmitted. Want to try that?
Comment 6•8 years ago
|
||
Comment hidden (mozreview-request) |
![]() |
||
Comment 8•8 years ago
|
||
mozreview-review |
Comment on attachment 8809396 [details]
Bug 1316249: tooltool: specify binary mode in all open(..) calls;
https://reviewboard.mozilla.org/r/92016/#review92188
Attachment #8809396 -
Flags: review?(gps) → review+
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8d8337d3199c
tooltool: specify binary mode in all open(..) calls; r=gps
![]() |
||
Comment 10•8 years ago
|
||
bugherder |
Assignee | ||
Updated•8 years ago
|
Component: Tools → General
You need to log in
before you can comment on or make changes to this bug.
Description
•