Closed
Bug 1364463
Opened 8 years ago
Closed 8 years ago
Bad objects being cached in cloud mirror
Categories
(Taskcluster :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: catlee, Assigned: jhford)
In https://treeherder.mozilla.org/logviewer.html#?job_id=98688466&repo=mozilla-central&lineNumber=145, we failed to download a partial update from TC. The relevant lines are:
2017-05-12 12:00:09,824 - INFO - Downloading https://queue.taskcluster.net/v1/task/OuO2Yft0Qpu5_4yEXUFZjQ/artifacts/public/env/Firefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar to /tmp/tmptpwz2Z...
2017-05-12 12:00:09,824 - DEBUG - attempt 1/5
2017-05-12 12:00:09,824 - DEBUG - retry: Calling <function download at 0x7fc3bcaf78c0> with args: ('https://queue.taskcluster.net/v1/task/OuO2Yft0Qpu5_4yEXUFZjQ/artifacts/public/env/Firefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar', '/tmp/tmptpwz2Z'), kwargs: {}, attempt #1
2017-05-12 12:00:09,825 - DEBUG - Downloading https://queue.taskcluster.net/v1/task/OuO2Yft0Qpu5_4yEXUFZjQ/artifacts/public/env/Firefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar to /tmp/tmptpwz2Z
2017-05-12 12:00:09,828 - INFO - Starting new HTTPS connection (1): queue.taskcluster.net
2017-05-12 12:00:10,079 - DEBUG - "GET /v1/task/OuO2Yft0Qpu5_4yEXUFZjQ/artifacts/public/env/Firefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar HTTP/1.1" 303 29
2017-05-12 12:00:10,081 - INFO - Starting new HTTPS connection (1): cloud-mirror.taskcluster.net
2017-05-12 12:00:16,459 - DEBUG - "GET /v1/redirect/s3/us-east-1/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Ftaskcluster-public-artifacts%2FOuO2Yft0Qpu5_4yEXUFZjQ%2F0%2Fpublic%2Fenv%2FFirefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar HTTP/1.1" 302 301
2017-05-12 12:00:16,460 - INFO - Starting new HTTPS connection (1): cloud-mirror-production-us-east-1.s3.amazonaws.com
2017-05-12 12:00:16,504 - DEBUG - "GET /https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Ftaskcluster-public-artifacts%2FOuO2Yft0Qpu5_4yEXUFZjQ%2F0%2Fpublic%2Fenv%2FFirefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar HTTP/1.1" 200 282
2017-05-12 12:00:16,505 - DEBUG - Downloaded 282 bytes
2017-05-12 12:00:16,505 - DEBUG - Content-Length: 282 bytes
The original artifact as fetched from queue.tc.net looks ok:
curl -iL https://queue.taskcluster.net/v1/task/OuO2Yft0Qpu5_4yEXUFZjQ/artifacts/public/env/Firefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar
HTTP/1.1 303 See Other
Server: Cowboy
Connection: keep-alive
X-Powered-By: Express
Strict-Transport-Security: max-age=7776000
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT
Access-Control-Request-Method: *
Access-Control-Allow-Headers: X-Requested-With,Content-Type,Authorization,Accept,Origin
Location: https://public-artifacts.taskcluster.net/OuO2Yft0Qpu5_4yEXUFZjQ/0/public/env/Firefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar
Vary: Accept
Content-Type: text/plain; charset=utf-8
Content-Length: 29
Date: Fri, 12 May 2017 15:15:25 GMT
Via: 1.1 vegur
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 16954022
Connection: keep-alive
Date: Fri, 12 May 2017 15:15:28 GMT
Last-Modified: Fri, 12 May 2017 11:53:32 GMT
ETag: "13c2fca56a66fc69e9e5c6842102d1a7"
x-amz-version-id: qxFDZLatR5t5JPcmEOuCBp39xce0Bvj.
Accept-Ranges: bytes
Server: AmazonS3
X-Cache: Miss from cloudfront
Via: 1.1 392869124c677c4f82415d8ce2dcdd73.cloudfront.net (CloudFront)
X-Amz-Cf-Id: IDTZ0dYFwS2H4bVdte-l3ksMdtox1otfUFDiK0hH5RAKe5b8_8ZsNg==
But fetching the cached version from cloud mirror fails:
curl -iL https://cloud-mirror.taskcluster.net/v1/redirect/s3/us-east-1/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Ftaskcluster-public-artifacts%2FOuO2Yft0Qpu5_4yEXUFZjQ%2F0%2Fpublic%2Fenv%2FFirefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar
HTTP/1.1 302 Found
Server: Cowboy
Connection: keep-alive
X-Powered-By: Express
Strict-Transport-Security: max-age=7776000
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT
Access-Control-Request-Method: *
Access-Control-Allow-Headers: X-Requested-With,Content-Type,Authorization,Accept,Origin
Location: https://cloud-mirror-production-us-east-1.s3.amazonaws.com/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Ftaskcluster-public-artifacts%2FOuO2Yft0Qpu5_4yEXUFZjQ%2F0%2Fpublic%2Fenv%2FFirefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar
Content-Type: application/json; charset=utf-8
Content-Length: 301
Etag: W/"12d-ea1202a0"
Date: Fri, 12 May 2017 15:16:13 GMT
Via: 1.1 vegur
HTTP/1.1 200 OK
x-amz-id-2: rKtg9RE+ueB3airfWkBLARA4VpGofIoSpYTYjab2KjCJqQ4dsIoof9M+FZsYOX79IP5bwLH024g=
x-amz-request-id: EC1BF649940A6CF3
Date: Fri, 12 May 2017 15:16:14 GMT
Last-Modified: Fri, 12 May 2017 12:00:17 GMT
x-amz-expiration: expiry-date="Sun, 14 May 2017 00:00:00 GMT", rule-id="us-east-1-1-day"
ETag: "7b65e6af6641ddd18bbff71e39eba530"
x-amz-meta-cloud-mirror-upstream-url: https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/OuO2Yft0Qpu5_4yEXUFZjQ/0/public/env/Firefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar
x-amz-meta-cloud-mirror-upstream-content-length: <unknown>
x-amz-meta-cloud-mirror-stored: 2017-05-12T12:00:15.839Z
x-amz-meta-cloud-mirror-upstream-etag: <unknown>
x-amz-meta-cloud-mirror-addresses: [{"c":200,"u":"https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/OuO2Yft0Qpu5_4yEXUFZjQ/0/public/env/Firefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar","t":"2017-05-12T12:00:10.717Z"}]
Accept-Ranges: bytes
Content-Type: application/xml
Content-Length: 282
Server: AmazonS3
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><RequestId>F177343936D4C750</RequestId><HostId>qsJdhOO7uH65Ug9v+nMMSBGlkJslzGJcvjMiZH6HWyMGTo44U4/RjMtV0bkPgGs3HE1dHO4mNFY=</HostId></Error>
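For illustration: the cached object above is a 282-byte S3 error document served with a 200 status, so the status code alone will not catch it. A minimal sketch of a body check a client could run before trusting a download follows. The function name `looks_like_s3_error` is hypothetical and not part of any Taskcluster tooling, and the check assumes the error body has the same `<Error><Code>…` shape as the one shown here.

```python
import xml.etree.ElementTree as ET


def looks_like_s3_error(body: bytes) -> bool:
    """Return True if body parses as an S3-style <Error> XML document.

    Hypothetical helper: assumes the error shape seen in this bug,
    i.e. a root <Error> element with a <Code> child.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Binary artifacts (or anything that is not XML) end up here.
        return False
    return root.tag == "Error" and root.find("Code") is not None
```

A caller would treat a True result as a failed download and retry, rather than handing the file to the updater.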
Comment 1 (Assignee) • 8 years ago
This happens infrequently. I've purged the problem file. Just so you know, there's a cache purging endpoint:
curl -i -X DELETE https://cloud-mirror.taskcluster.net/v1/purge/s3/us-east-1/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Ftaskcluster-public-artifacts%2FOuO2Yft0Qpu5_4yEXUFZjQ%2F0%2Fpublic%2Fenv%2FFirefox-mozilla-central-55.0a1-linux64-es-MX-20170510183715-20170512100218.partial.mar
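The purge URL in that command follows the same pattern as the redirect URL in the logs: the upstream S3 URL is percent-encoded in full and appended after the service and region. A small sketch of building it, assuming that pattern holds in general (only this one example is confirmed here); the helper name `purge_url` and its parameters are hypothetical.

```python
from urllib.parse import quote

# Base URL taken from the curl command in this bug.
CLOUD_MIRROR_BASE = "https://cloud-mirror.taskcluster.net/v1"


def purge_url(upstream_url: str, region: str = "us-east-1",
              service: str = "s3") -> str:
    """Build a cloud-mirror purge URL for an upstream artifact URL.

    Assumption: the path is /purge/<service>/<region>/<encoded upstream>,
    inferred from the single example in this bug.
    """
    # safe="" so that ':' and '/' are percent-encoded as well.
    encoded = quote(upstream_url, safe="")
    return f"{CLOUD_MIRROR_BASE}/purge/{service}/{region}/{encoded}"
```

The resulting URL would then be sent with the DELETE method, as in the curl command.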
Comment 2 (Assignee) • 8 years ago
Is this a consistent problem, or the usual, very infrequent one?
Assignee: nobody → jhford
Comment 3 (Assignee) • 8 years ago
It looks like this is not recurring. Please feel free to open a new bug if that's not the case. Generally speaking, this corruption has been caught around 3-4 times in the last two years, and until we have the new SHA256 stuff in the queue, it's pretty difficult for this to be detected and remedied automatically. I'm going to mark this bug as FIXED because the invalid file was successfully purged from the cache.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Comment 4 (Reporter) • 8 years ago
I suspect we only notice it when it breaks nightly or release updates.
Should automation be modified to purge the cache in case of errors?
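As a rough sketch of what "purge on error" could look like in automation: fetch, validate, and if the copy is bad, purge the cached object before retrying. All names below are hypothetical, and the fetch, validation, and purge steps are passed in as callables, so nothing here reflects how the real download code is structured.

```python
from typing import Callable


def fetch_with_purge(fetch: Callable[[], bytes],
                     is_valid: Callable[[bytes], bool],
                     purge: Callable[[], None],
                     attempts: int = 5) -> bytes:
    """Retry a download, purging the cached copy after each bad result.

    Hypothetical sketch: fetch() returns the body, is_valid() decides
    whether it is usable, purge() asks the mirror to drop its copy.
    """
    for _ in range(attempts):
        body = fetch()
        if is_valid(body):
            return body
        # Bad copy: purge it so the next attempt is re-mirrored upstream.
        purge()
    raise RuntimeError(f"no valid copy after {attempts} attempts")
```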
Comment 5 (Assignee) • 8 years ago
Ideally yes, but when the new artifact API lands we'll be able to detect whether the file is the one that was intended to be created. I also intend to add some headers to cloud-mirror-based files, which will let us know whether the corruption occurred there.
The ideal outcome would be that cloud-mirror uses the x-amz-meta-{content,transfer}-sha256 values to detect an invalid transfer and reject it.
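A minimal sketch of that kind of check, assuming the header value is a hex-encoded SHA-256 digest of the bytes being compared (the encoding is not specified in this bug). The function name `matches_sha256` is hypothetical.

```python
import hashlib
import hmac


def matches_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 of data equals expected_hex.

    Assumption: expected_hex is a hex digest, e.g. the value of an
    x-amz-meta-content-sha256 header.
    """
    actual = hashlib.sha256(data).hexdigest()
    # Normalise case and whitespace before comparing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

With a check like this, the 282-byte error document would be rejected because its digest cannot match the digest recorded for the real artifact.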