Closed
Bug 1419577
Opened 7 years ago
Closed 4 years ago
+ character not supported in artifact file names
Categories
(Taskcluster :: Services, defect, P5)
Taskcluster
Services
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: glandium, Assigned: dustin)
References
Details
(Note the following artifact is going to expire in 28 days, so the following won't be reproducible past that date)
The following task has multiple artifacts with a + in their file name:
https://tools.taskcluster.net/groups/KGxjWaOITgaajaW9BD-USg/tasks/C5gWXuxeQia08zmiv5SUkQ/runs/0/artifacts
Downloading them ends with a 403 Access Denied error.
$ curl -svOL https://queue.taskcluster.net/v1/task/C5gWXuxeQia08zmiv5SUkQ/runs/0/artifacts/public/build/public/build/libpython2.7-dev_2.7.9-2+deb7moz1_amd64.deb
(snip)
> GET /v1/task/C5gWXuxeQia08zmiv5SUkQ/runs/0/artifacts/public/build/public/build/libpython2.7-dev_2.7.9-2+deb7moz1_amd64.deb HTTP/1.1
> Host: queue.taskcluster.net
> User-Agent: curl/7.56.1
> Accept: */*
>
{ [5 bytes data]
< HTTP/1.1 303 See Other
< Server: Cowboy
< Connection: keep-alive
< X-Powered-By: Express
< Strict-Transport-Security: max-age=7776000
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT
< Access-Control-Request-Method: *
< Access-Control-Allow-Headers: X-Requested-With,Content-Type,Authorization,Accept,Origin
< Location: https://public-artifacts.taskcluster.net/C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/build/libpython2.7-dev_2.7.9-2+deb7moz1_amd64.deb
< Vary: Accept
< Content-Type: text/plain; charset=utf-8
< Content-Length: 29
< Date: Tue, 21 Nov 2017 22:11:32 GMT
< Via: 1.1 vegur
<
* Ignoring the response-body
{ [29 bytes data]
100 29 100 29 0 0 29 0 0:00:01 --:--:-- 0:00:01 32
* Connection #0 to host queue.taskcluster.net left intact
* Issue another request to this URL: 'https://public-artifacts.taskcluster.net/C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/build/libpython2.7-dev_2.7.9-2+deb7moz1_amd64.deb'
(snip)
> GET /C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/build/libpython2.7-dev_2.7.9-2+deb7moz1_amd64.deb HTTP/1.1
> Host: public-artifacts.taskcluster.net
> User-Agent: curl/7.56.1
> Accept: */*
>
{ [5 bytes data]
< HTTP/1.1 403 Forbidden
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Connection: keep-alive
< Date: Tue, 21 Nov 2017 22:11:32 GMT
< Server: AmazonS3
< X-Cache: Error from cloudfront
< Via: 1.1 f5b2faea8f075526fbc66f630b06952f.cloudfront.net (CloudFront)
< X-Amz-Cf-Id: 5Cduf2hvSrx-9uP8MnesGmC6JX0EH2E_xSNHQUoT2Q2S5jyzt8C9bg==
<
{ [249 bytes data]
100 243 0 243 0 0 243 0 --:--:-- 0:00:01 --:--:-- 0
* Connection #1 to host public-artifacts.taskcluster.net left intact
As noted on IRC, the corresponding S3 url uses an escaped url with %2B instead of +, so
https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/build/libpython2.7-dev_2.7.9-2%2Bdeb7moz1_amd64.deb
works.
But replacing the + with %2B in the original queue url doesn't work either, because it still redirects to the public-artifacts url with a +.
Reporter | ||
Comment 1•7 years ago
|
||
I'll note that apt expects a + in the url, not a %2B. So for instance, the following source doesn't currently work:
deb [trusted=yes] http://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/ build/
Comment 2•7 years ago
|
||
So this is an issue with the old Artifact API. The issue is that S3 will always URL encode all object names, so it is actually being stored in S3 as 'https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/build/libpython2.7-dev_2.7.9-2%2Bdeb7moz1_amd64.deb'.
What we need to do is URL Encode the 'name' parameter when we generate the S3 url. Since S3 requires the URL encoded form, we probably also need to do the same for Cloud Front. I'll work on a patch.
From a machine in US-West-2 so that I could bypass cloudfront:
[ec2-user@ip-172-31-42-7 ~]$ curl -I https://queue.taskcluster.net/v1/task/C5gWXuxeQia08zmiv5SUkQ/runs/0/artifacts/public/build/public/build/libpython2.7-dev_2.7.9-2+deb7moz1_amd64.deb
HTTP/1.1 303 See Other
...
Location: https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/build/libpython2.7-dev_2.7.9-2+deb7moz1_amd64.deb
...
[ec2-user@ip-172-31-42-7 ~]$ curl -I https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/build/libpython2.7-dev_2.7.9-2+deb7moz1_amd64.deb
HTTP/1.1 403 Forbidden
...
[ec2-user@ip-172-31-42-7 ~]$ curl -I https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/C5gWXuxeQia08zmiv5SUkQ/0/public/build/public/build/libpython2.7-dev_2.7.9-2%2Bdeb7moz1_amd64.deb
HTTP/1.1 200 OK...
Comment 3•7 years ago
|
||
I have a patch up at https://github.com/taskcluster/taskcluster-queue/pull/233
Comment 4•7 years ago
|
||
Mike, is the intention to serve an apt repository from queue.taskcluster.net urls? Also, did you mean to have the artifact name be "public/build/public/build/..."?
Flags: needinfo?(mh+mozilla)
Updated•7 years ago
|
Assignee: nobody → jhford
Reporter | ||
Comment 5•7 years ago
|
||
(In reply to John Ford [:jhford] CET/CEST Berlin Time from comment #4)
> Mike, is the intention to serve an apt repository from queue.taskcluster.net
> urls?
Yes
> Also, did you mean to have the artifact name be
> "public/build/public/build/..."?
No, that's something I fixed.
Flags: needinfo?(mh+mozilla)
Updated•7 years ago
|
Blocks: tc-stability
Comment 6•7 years ago
|
||
Commit pushed to master at https://github.com/taskcluster/taskcluster-queue
https://github.com/taskcluster/taskcluster-queue/commit/be13f1899e2a5a8541ef9c08c685424bb4cf1707
bug 1419577 -- URL Encode the prefix portion of S3 urls
It turns out that regardless of what we put into the prefix, S3 will URL
Encode some characters of the prefix (e.g. + will always be encoded to
%2B). This results in broken URLs. This approach is the simple
approach of just encoding the entire S3 Object Key. I've verified that
with a problem URL (e.g. has + in it) that this encoding results in
valid URLS which get 200 from S3 instead of 403s. I've also verified
that Cloudfront also supports this encoding.
Another approach for prettier URLs would be to revert the %2F character
back to a / character since S3 and CloudFront already deal with those
characters quite well. This would require
prefix = encodeURIComponent(prefix).replace(/%2f/gi, "/");
Updated•6 years ago
|
Updated•6 years ago
|
Component: Queue → Services
Updated•6 years ago
|
Assignee: jhford → nobody
Updated•6 years ago
|
QA Contact: jhford
Assignee | ||
Comment 7•4 years ago
|
||
I'll check that the object service gets this right, or if it's an issue with S3, at least documents the issue.
Assignee: nobody → dustin
Assignee | ||
Comment 8•4 years ago
|
||
Yup, S3's broken.
Assignee | ||
Comment 9•4 years ago
|
||
OK, OK, it isn't broken, we were just not encoding +
.
Assignee | ||
Updated•4 years ago
|
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 10•4 years ago
|
||
The fix in the object service was to use encodeURIComponent
instead of encodeURI
. But, the object service isn't in use yet, so this doesn't actually fix the issue for users.
Looking at the queue's implementation of s3
artifacts, which are still in use, we appear not to encode at all. I'm not at all confident that trying to change that wouldn't break some existing assumptions about artifacts, so I'm hesitant to try to fix it, rather than considering this fixed when we're using objects for artifacts.
You need to log in
before you can comment on or make changes to this bug.
Description
•