Closed
Bug 1229908
Opened 7 years ago
Closed 7 years ago
ScriptMixin._urlopen() has to use quoted URL to not fail if spaces are contained
Categories
(Release Engineering :: Applications: MozharnessCore, defect)
Release Engineering
Applications: MozharnessCore
Tracking
(firefox45 fixed)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox45 | --- | fixed |
People
(Reporter: whimboo, Assigned: whimboo)
Details
Attachments
(1 file)
1.13 KB,
patch
|
jlund
:
review+
|
Details | Diff | Splinter Review |
I have seen this today when processing build notifications for beta release candidate builds. The build url contained in those messages for builds on OS X contain a space and this breaks mozharness for downloading the build.
> 14:02:21 INFO - retry: Calling _download_file with args: (), kwargs: {'url': 'http://archive.mozilla.org/pub/firefox/candidates/43.0b8-candidates/build1/mac/en-US/Firefox 43.0b8.dmg', 'file_name': '/mozilla/code/gecko/testing/mozharness/build/Firefox 43.0b8.dmg'}, attempt #1
> 14:02:21 WARNING - Server returned status 404 HTTP Error 404: Not Found for http://archive.mozilla.org/pub/firefox/candidates/43.0b8-candidates/build1/mac/en-US/Firefox 43.0b8.dmg
If the URL is not properly quoted, mozharness should ensure to do it.
Assignee | ||
Comment 1•7 years ago
|
||
Over on bug 1229761 Chris mentioned a difference between urllib and urllib2: >>> import urllib2 >>> urllib2.urlopen('http://archive.mozilla.org/pub/firefox/candidates/43.0b8-candidates/build1/mac/en-US/Firefox 43.0b8.dmg') Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen return opener.open(url, data, timeout) File "/usr/lib/python2.7/urllib2.py", line 437, in open response = meth(req, response) File "/usr/lib/python2.7/urllib2.py", line 550, in http_response 'http', request, response, code, msg, hdrs) File "/usr/lib/python2.7/urllib2.py", line 475, in error return self._call_chain(*args) File "/usr/lib/python2.7/urllib2.py", line 409, in _call_chain result = func(*args) File "/usr/lib/python2.7/urllib2.py", line 558, in http_error_default raise HTTPError(req.get_full_url(), code, msg, hdrs, fp) urllib2.HTTPError: HTTP Error 404: Not Found >>> import urllib >>> urllib.urlopen('http://archive.mozilla.org/pub/firefox/candidates/43.0b8-candidates/build1/mac/en-US/Firefox 43.0b8.dmg') <addinfourl at 139904123534656 whose fp = <socket._fileobject object at 0x7f3df795edd0>> So urllib.urlopen() works and seems to internally quote the url which is not done by urllib2.urlopen().
Assignee | ||
Comment 2•7 years ago
|
||
This is actually a bug in urllib2.urlopen() which has not been fixed yet: http://bugs.python.org/issue13359
Assignee | ||
Comment 3•7 years ago
|
||
Would this addition be an option for us? https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/base/script.py#303 > def _urlopen(self, url, **kwargs): > [..] >+ # http://bugs.python.org/issue13359 - urllib2 does not automatically quote the URL yet >+ url_quoted = urllib2.quote(url, safe='%/:=&?~#+!$,;\'@()*[]|') > return urllib2.urlopen(url_quoted, **kwargs)
Flags: needinfo?(jlund)
Flags: needinfo?(catlee)
Comment 4•7 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #3) > Would this addition be an option for us? > > https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/ > base/script.py#303 > > > def _urlopen(self, url, **kwargs): > > [..] > >+ # http://bugs.python.org/issue13359 - urllib2 does not automatically quote the URL yet > >+ url_quoted = urllib2.quote(url, safe='%/:=&?~#+!$,;\'@()*[]|') > > return urllib2.urlopen(url_quoted, **kwargs) seems reasonable to me. where did you get the string safe value from?
Flags: needinfo?(jlund)
Assignee | ||
Comment 5•7 years ago
|
||
(In reply to Jordan Lund (:jlund) from comment #4) > > > def _urlopen(self, url, **kwargs): > > > [..] > > >+ # http://bugs.python.org/issue13359 - urllib2 does not automatically quote the URL yet > > >+ url_quoted = urllib2.quote(url, safe='%/:=&?~#+!$,;\'@()*[]|') > > > return urllib2.urlopen(url_quoted, **kwargs) > > seems reasonable to me. where did you get the string safe value from? From one of the patches attached on the given Python bug report. Not sure if it is complete or if we would ever need all those safe values. But it should be better as what we have now and more stable as when we would try to make use of urllib.urlopen() here if even possible at all. As lesser changes we have as more confident I would be to not accidentally break something.
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Flags: needinfo?(catlee)
Summary: download_file should properly use quoted URLs to not fail if spaces are contained → ScriptMixin._urlopen() has to use quoted URL to not fail if spaces are contained
Assignee | ||
Comment 6•7 years ago
|
||
Attachment #8695223 -
Flags: review?(jlund)
Assignee | ||
Comment 7•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ab58e4d3e34d
Comment 8•7 years ago
|
||
Comment on attachment 8695223 [details] [diff] [review] Patch v1 Review of attachment 8695223 [details] [diff] [review]: ----------------------------------------------------------------- thanks for this
Attachment #8695223 -
Flags: review?(jlund) → review+
Comment 10•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f3741d47fb6b
You need to log in
before you can comment on or make changes to this bug.
Description
•