Closed Bug 1306040 Opened 8 years ago Closed 8 years ago

queue redirecting to cloudfront for EC2 instances

Categories

(Taskcluster :: Services, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: dustin, Unassigned)

Attachments

(1 file)

Looking at just two hours of CloudFront logs for https://taskcluster-public-artifacts.taskcluster.net and extracting the distinct source IPs, I get about 2.4k distinct EC2 addresses, in both us-east-1 and us-west-2. From https://github.com/taskcluster/taskcluster-queue/blob/master/src/artifacts.js#L358 it appears that all EC2 requests should be going either directly to the S3 bucket URL or to cloud-mirror. The region metadata is dated 2016-09-26-16-49-06, and the two hours sampled were the 03:00 UTC hour on the 27th and on the 28th, so the regions should not have changed since that time. Specifically, the logfiles were:

EXGGJTH3KS8NS.2016-09-27-03.1998f3eb
EXGGJTH3KS8NS.2016-09-27-03.f2a2f8ae
EXGGJTH3KS8NS.2016-09-28-03.3f5b4a34
EXGGJTH3KS8NS.2016-09-28-03.c91eb9fd
EXGGJTH3KS8NS.2016-09-27-03.234a9959
EXGGJTH3KS8NS.2016-09-27-03.f311188a
EXGGJTH3KS8NS.2016-09-28-03.4a18334b
EXGGJTH3KS8NS.2016-09-28-03.dd17876b
EXGGJTH3KS8NS.2016-09-27-03.9f5ff394
EXGGJTH3KS8NS.2016-09-27-03.fe2305ac
EXGGJTH3KS8NS.2016-09-28-03.8362d369
EXGGJTH3KS8NS.2016-09-27-03.ae0b2232
EXGGJTH3KS8NS.2016-09-28-03.18f6d377
EXGGJTH3KS8NS.2016-09-28-03.9f970f11
I can confirm this manually from lamport, which is in us-west-2:

(sandbox) dustin@lamport ~/p/m-c (bug1269443) $ curl -L -v -o /dev/null https://queue.taskcluster.net/v1/task/SDlWxl5OQTOX9u4ZNytuTA/artifacts/public/build/target.reftest.tests.zip
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
*   Trying 54.235.141.80...
* Connected to queue.taskcluster.net (54.235.141.80) port 443 (#0)
* found 173 certificates in /etc/ssl/certs/ca-certificates.crt
* found 692 certificates in /etc/ssl/certs
* ALPN, offering http/1.1
* SSL connection using TLS1.2 / ECDHE_RSA_AES_128_GCM_SHA256
*        server certificate verification OK
*        server certificate status verification SKIPPED
*        common name: auth.taskcluster.net (matched)
*        server certificate expiration date OK
*        server certificate activation date OK
*        certificate public key: RSA
*        certificate version: #3
*        subject: C=US,ST=California,L=Mountain View,O=Mozilla Corporation,CN=auth.taskcluster.net
*        start date: Thu, 17 Mar 2016 00:00:00 GMT
*        expire date: Fri, 22 Mar 2019 12:00:00 GMT
*        issuer: C=US,O=DigiCert Inc,CN=DigiCert SHA2 Secure Server CA
*        compression: NULL
* ALPN, server did not agree to a protocol
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> GET /v1/task/SDlWxl5OQTOX9u4ZNytuTA/artifacts/public/build/target.reftest.tests.zip HTTP/1.1
> Host: queue.taskcluster.net
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 303 See Other
< Server: Cowboy
< Connection: keep-alive
< X-Powered-By: Express
< Strict-Transport-Security: max-age=7776000
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT
< Access-Control-Request-Method: *
< Access-Control-Allow-Headers: X-Requested-With,Content-Type,Authorization,Accept,Origin
< Location: https://public-artifacts.taskcluster.net/SDlWxl5OQTOX9u4ZNytuTA/0/public/build/target.reftest.tests.zip
< Vary: Accept
< Content-Type: text/plain; charset=utf-8
< Content-Length: 29
< Date: Wed, 28 Sep 2016 16:36:43 GMT
< Via: 1.1 vegur
<
* Ignoring the response-body
{ [29 bytes data]
100    29  100    29    0     0     34      0 --:--:-- --:--:-- --:--:--    34
* Connection #0 to host queue.taskcluster.net left intact
* Issue another request to this URL: 'https://public-artifacts.taskcluster.net/SDlWxl5OQTOX9u4ZNytuTA/0/public/build/target.reftest.tests.zip'
*   Trying 52.84.232.137...
* Connected to public-artifacts.taskcluster.net (52.84.232.137) port 443 (#1)
* found 173 certificates in /etc/ssl/certs/ca-certificates.crt
* found 692 certificates in /etc/ssl/certs
* ALPN, offering http/1.1
* SSL connection using TLS1.2 / ECDHE_RSA_AES_128_GCM_SHA256
*        server certificate verification OK
*        server certificate status verification SKIPPED
*        common name: auth.taskcluster.net (matched)
*        server certificate expiration date OK
*        server certificate activation date OK
*        certificate public key: RSA
*        certificate version: #3
*        subject: C=US,ST=California,L=Mountain View,O=Mozilla Corporation,CN=auth.taskcluster.net
*        start date: Thu, 17 Mar 2016 00:00:00 GMT
*        expire date: Fri, 22 Mar 2019 12:00:00 GMT
*        issuer: C=US,O=DigiCert Inc,CN=DigiCert SHA2 Secure Server CA
*        compression: NULL
* ALPN, server accepted to use http/1.1
> GET /SDlWxl5OQTOX9u4ZNytuTA/0/public/build/target.reftest.tests.zip HTTP/1.1
> Host: public-artifacts.taskcluster.net
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/zip
< Content-Length: 32102631
< Connection: keep-alive
< Date: Wed, 28 Sep 2016 15:12:33 GMT
< Last-Modified: Wed, 28 Sep 2016 03:45:18 GMT
< ETag: "098bdf06f370a606a75637840c4c0b3d"
< x-amz-version-id: jwmF11fp84D3PjyPIbbVORt_v1ecEQvR
< Accept-Ranges: bytes
< Server: AmazonS3
< Age: 5051
< X-Cache: Hit from cloudfront
< Via: 1.1 336f0e6ef9a3462f682d6ca49029b665.cloudfront.net (CloudFront)
< X-Amz-Cf-Id: TdO_6aMqDx0H3Eg1X_bj91Umono9VRWe00bjTKAvj1iL-jO3S-2nZw==
<
{ [16384 bytes data]
100 30.6M  100 30.6M    0     0  18.8M      0  0:00:01  0:00:01 --:--:-- 66.8M
* Connection #1 to host public-artifacts.taskcluster.net left intact

I suspect this is, at the least, an opportunity for cost savings, but may also be related to bug 1305752 and bug 1305768.
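For reference, the redirect behavior this comment expects (EC2 requesters go to S3 or cloud-mirror, everyone else to CloudFront) can be sketched roughly as below. This is only an illustration, not the queue's actual code in artifacts.js: the function name `choose_artifact_host`, the `bucket_region` parameter, and the rule that same-region requesters go straight to S3 are assumptions, and the sample prefixes are documentation addresses rather than real EC2 ranges.

```python
import ipaddress


def choose_artifact_host(client_ip, ec2_ranges, bucket_region):
    """Pick where a public-artifact request should be redirected.

    client_ip     -- requester address as a string
    ec2_ranges    -- iterable of (cidr, region) pairs, e.g. built from
                     Amazon's ip-ranges.json
    bucket_region -- region of the artifact bucket (an assumption here)

    Returns 's3', 'cloud-mirror', or 'cloudfront'. Hypothetical helper,
    not the logic in taskcluster-queue.
    """
    addr = ipaddress.ip_address(client_ip)
    for cidr, region in ec2_ranges:
        if addr in ipaddress.ip_network(cidr):
            # EC2 requester: same region reads the bucket directly,
            # any other EC2 region goes through cloud-mirror.
            return "s3" if region == bucket_region else "cloud-mirror"
    # Not a known EC2 address: serve via the CDN.
    return "cloudfront"


# Made-up prefixes (RFC 5737 documentation ranges), not real EC2 ranges.
SAMPLE_RANGES = [
    ("192.0.2.0/24", "us-west-2"),
    ("198.51.100.0/24", "us-east-1"),
]

if __name__ == "__main__":
    for ip in ("192.0.2.10", "198.51.100.7", "203.0.113.5"):
        print(ip, choose_artifact_host(ip, SAMPLE_RANGES, "us-west-2"))
```

Under a rule like this, the curl run above would not be expected to end at a CloudFront host, since lamport is an EC2 instance.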
See Also: → 1305768, 1305752
Summary: queue redirecting to cloudfront for EC2 instances in us-west-2 → queue redirecting to cloudfront for EC2 instances
Attached file regions.txt
The 2438 distinct IPs I found, tagged with region. I downloaded and unzipped the CF logs, then ran

    cut -d' ' -f 5 * | sort -u > ips

and then ran this script:

    import requests
    from IPy import IP

    # One source IP per line, as written by the cut | sort pipeline above.
    ips = [IP(line.strip()) for line in open("ips") if line.strip()]

    # Amazon's published address ranges, as (prefix, region) pairs.
    ranges = requests.get('https://ip-ranges.amazonaws.com/ip-ranges.json').json()
    ranges = [(IP(pfx['ip_prefix']), pfx['region']) for pfx in ranges['prefixes']]

    # Tag each IP with the first matching region, or 'unknown'.
    for ip in ips:
        region = 'unknown'
        for pfx, rgn in ranges:
            if ip in pfx:
                region = rgn
                break
        print(region, ip)
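To summarize a tagged list like this per region, something along these lines could be used. It is a minimal sketch that assumes each line has the form "region ip", which is what the script's print statement produces; the function name `count_by_region` is made up for illustration, and the sample addresses are documentation IPs, not entries from the attachment.

```python
from collections import Counter


def count_by_region(lines):
    """Count tagged IPs per region from lines of the form 'region ip'."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            # Skip blank or malformed lines rather than guessing.
            continue
        region, _ip = parts
        counts[region] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "us-west-2 192.0.2.1",
        "us-east-1 198.51.100.2",
        "us-west-2 192.0.2.3",
    ]
    for region, n in count_by_region(sample).most_common():
        print(region, n)
```

Passing an open file handle for the attachment (for example `open("regions.txt")`) in place of `sample` would work the same way, since the function only iterates over lines.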
I submitted https://github.com/taskcluster/taskcluster-queue/pull/120 to help figure out what the queue's view of the requester is.
I merged the above PR, which should fix us-west-2 and give us some more debug statements. I suspect we ought to refactor, but really, we want to move region resolution into cloud-mirror.
Dustin says I can close this.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
Component: Queue → Services