Closed
Bug 1295948
Opened 8 years ago
Closed 7 years ago
query_build_dir_url() doesn't support encoded characters in build URL
Categories
(Release Engineering :: Applications: MozharnessCore, defect)
Release Engineering
Applications: MozharnessCore
Tracking
(firefox53 fixed)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox53 | --- | fixed |
People
(Reporter: whimboo, Assigned: svezauzeto12, Mentored)
Details
(Whiteboard: [lang=py])
Attachments
(1 file, 1 obsolete file)
query_build_dir_url() [1] is doing some magic to find test packages as referenced in the test_packages.json file. It searches for the last '/' and replaces the remaining string with the appropriate test file. As I just noticed this doesn't work if the '/' is HTML encoded. Here an example: https://queue.taskcluster.net/v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public%2Fbuild%2Ftarget.test_packages.json With this above logic we end up with a common.tests.zip file as: https://queue.taskcluster.net/v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/common.tests.zip It strips off the 'public/build/' path which forces TC to ask for permissions to download the file, which ends-up with a 403 forbidden failure. We should hardening the method so it can also cope with encoded slashes. [1] https://dxr.mozilla.org/mozilla-central/rev/fe895421dfbe1f1f8f1fc6a39bb20774423a6d74/testing/mozharness/mozharness/mozilla/testing/testbase.py#150
Comment 1•8 years ago
|
||
I'm guessing that instead of looking for '/' in the url, we should use something that already parses urls like urlparse, and urldecode the path if we want to allow for encoded slashes.
Reporter | ||
Comment 2•8 years ago
|
||
Sounds like a good idea, and I feel it would be great for a mentored project. Do you want to take that Aki?
Reporter | ||
Updated•8 years ago
|
Whiteboard: [lang=py]
Assignee | ||
Comment 4•7 years ago
|
||
Can I work on this ? Thanks
Comment 5•7 years ago
|
||
Sure. It looks like you've already contributed patches; let me know if you have any questions or need any guidance to write this patch.
Assignee | ||
Comment 6•7 years ago
|
||
If I understood correctly I should just decode url so it goes from : https://queue.taskcluster.net/v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public%2Fbuild%2Ftarget.test_packages.json to : https://queue.taskcluster.net/v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public/build/target.test_packages.json than can be done ( as you mentioned above ) with urllib2.unquote(our_url) after that I should parse url and return url path, which in our case should be: /v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public/build/target.test_packages.json and then I should strip "target.test_packages.json" so filename can be added. We should get this: /v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public/build (/filename is added after build ) Is that correct ? If it is please review this commit. It is possible that I have made error even if the above assumption is correct, I have never worked on this large project before. I have added comments in code so you follow my thoughts-slow more easily, we can just remove them later when you are satisfied with commit. Thanks.
Assignee | ||
Comment 7•7 years ago
|
||
error when trying to push commit: error publishing review request 95750: Error publishing: Bugzilla error: Aki Sasaki [:aki] (PTO, back Nov28) <aki@mozilla.com> is not currently accepting 'review' requests. (HTTP 500, API Error 225) (review requests not published; visit review url to attempt publishing there)
Comment 8•7 years ago
|
||
I just unblocked review requests. Try again?
Comment hidden (mozreview-request) |
Assignee | ||
Comment 10•7 years ago
|
||
commited, thanks :)
Comment 11•7 years ago
|
||
mozreview-review |
Comment on attachment 8814566 [details] Bug 1295948 - fixing method so it works with encoded URLs https://reviewboard.mozilla.org/r/95752/#review95792 r- because this patch changes the return value from a url to a path. In the review comments, I've shown a couple ways we can do this. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:169 (Diff revision 1) > > + #first we decode url > + reference_url = urllib2.unquote(reference_url) > + > + #now we parse url so we can extract path > + reference_url = urlparse(reference_url) Here you overwrite the reference_url string with a `urlparse.ParseResult`. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:172 (Diff revision 1) > + > + #now we parse url so we can extract path > + reference_url = urlparse(reference_url) > + > + #now we should extract path to string so we can find last slash > + reference_url = reference_url.path Here you overwrite `reference_url` with just the path. So if `reference_url` was once `https://server.name/path/to/artifact`, it is now `/path/to/artifact`. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:177 (Diff revision 1) > + reference_url = reference_url.path > + > + # and now this should work even with encoded urls ( which are not encoded anymore > + # when they come by the time they come here > last_slash = reference_url.rfind('/') > base_url = reference_url[:last_slash] Since `reference_url` no longer has the scheme or netloc (or any other url information, just the path), `base_url` will also be just a path. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:179 (Diff revision 1) > + # and now this should work even with encoded urls ( which are not encoded anymore > + # when they come by the time they come here > last_slash = reference_url.rfind('/') > base_url = reference_url[:last_slash] > > return '%s/%s' % (base_url, file_name) Now, since `base_url` is just a path, we're returning a path instead of a url. I can think of two things that might work: 1. remove the urlparse lines (lines 168-172). That means we'll just unquote and then remove everything in the url from the rightmost `/` onwards as the `base_url`, and add the filename. This is not proper behavior, because a url can have lots of info post-filename (e.g., https://server.name/path/to/artifact?query=string&query2=string2#anchor) but I don't think we use queries or anchors in the artifact urls, so this will probably be sufficient. 2. Instead of saying `reference_url = urlparse(reference_url)` and `reference_url = reference_url.path`, you can do something like ```python from urlparse import urlparse, ParseResult reference_url = "https://server/path/to/artifact?query=string#anchor" file_name = "new_artifact" # parts will be a list from the ParseResult: # ['https', 'server', '/path/to/artifact', '', 'query=string', 'anchor'] parts = list(urlparse(reference_url)) last_slash = parts[2].rfind('/') parts[2] = '/'.join([parts[2][:last_slash], file_name]) print ParseResult(*parts).geturl() ``` which changes `https://server/path/to/artifact?query=string#anchor` to `https://server/path/to/new_artifact?query=string#anchor` and is the proper behavior. Does that help?
Attachment #8814566 -
Flags: review?(aki) → review-
Updated•7 years ago
|
Assignee: nobody → svezauzeto12
Assignee | ||
Comment 12•7 years ago
|
||
I have made new commit based on what we discussed. Hope it is working now, I have made few small tests in my own script but I am still not 100% sure that it is correct. I have made changes according to your 2. method which you described above. Thanks
Comment hidden (mozreview-request) |
Comment 14•7 years ago
|
||
Thanks! There's at least one nit. I triggered https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b2c13ed853747fdf554fb0a2cddd200825ff2de to verify this works.
Comment 15•7 years ago
|
||
mozreview-review |
Comment on attachment 8814732 [details] Bug 1295948 - fixing method so it works with encoded URLs, changed return method https://reviewboard.mozilla.org/r/95826/#review96166 One small nit, below. I'd like to see a try run finish with a unit test and a talos test successfully passing this part. I triggered a new one because the old one wasn't cooperating for some reason: https://treeherder.mozilla.org/#/jobs?repo=try&revision=63db69e1f268f3ad22f99d17a3a75244f3a3953e Using a small python script for testing was a good approach! I'm pretty sure this is good, but I'd rather be on the safe side. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:174 (Diff revision 1) > - # and now this should work even with encoded urls ( which are not encoded anymore > - # when they come by the time they come here > - last_slash = reference_url.rfind('/') > - base_url = reference_url[:last_slash] > > - return '%s/%s' % (base_url, file_name) > + return '%s' %url You only need to `return url` :) The '%s' is string replacement. It was there to join the `base_url` and `file_name` earlier, but we don't need that.
Attachment #8814732 -
Flags: review?(aki) → review-
Comment 16•7 years ago
|
||
Looks like the try run looks good, so the return statement is the only thing that needs fixing. Thank you!
Assignee | ||
Comment 17•7 years ago
|
||
I think this should be it. Thanks for mentoring.
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8814732 -
Attachment is obsolete: true
Comment 19•7 years ago
|
||
mozreview-review |
Comment on attachment 8814566 [details] Bug 1295948 - fixing method so it works with encoded URLs https://reviewboard.mozilla.org/r/95752/#review96524 Thanks!
Attachment #8814566 -
Flags: review?(aki) → review+
Comment 20•7 years ago
|
||
Pushed by asasaki@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/46a0a73fde9a fixing method so it works with encoded URLs r=aki
Comment 21•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/46a0a73fde9a
You need to log in
before you can comment on or make changes to this bug.
Description
•