Closed Bug 743856 Opened 13 years ago Closed 13 years ago

mozharness.base.python.query_python_package() can give false positives

Categories

(Release Engineering :: Applications: MozharnessCore, defect, P5)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mozilla, Unassigned)

Details

(Whiteboard: [mozharness])

Attachments

(1 file, 2 obsolete files)

From bug 697462 comment 19: > > packages = output.split() > > return [package for package in packages > > if package.lower().find(package_name.lower()) != -1] > > > > Not part of this patch but this could give false positives > > Hm. Well, it'll be unused soon, so if it's not good we can yank it. > Sounds like we need a MEGAREGEX! ;-) Again, somewhat OT for this bug, but casual inspection seems to indicate that if you * ignore lines starting with -e (maybe a few other flags, maybe 'if not line.startswith('-')' * split on '==', '<=', '>=' and take the first part you should be able to be pretty robust. But there might be some cases I don't know about. In any case, it would be better than what we have.
I wonder if we need/want this function. There should normally either be one match (the package is installed) or, in an odd case, multiple matches (there is more than one version of the package installed). Also, to correct what I said in comment 0, `pip freeze` will always freeze a precise version, so all we need to check for is '==' I believe for package lines
one solution: def pip_freeze_packages(self, pip_freeze_output): """reads packages from `pip freeze` output and returns a list""" packages = [] for line in pip_freeze_output.splitlines(): line = line.strip() if not line: continue if line.startswith('-'): # not a package, probably like '-e http://example.com/path#egg=package-dev' continue if '==' not in line: self.fatal("pip_freeze_packages: Unrecognized output line: %s" % line) packages.append(line) # append the whole line, package + version, e.g. Tempita==0.5.1 return packages
Attached patch fix + test (obsolete) — Splinter Review
So I'm not entirely sure what we actually want to do here. That is, do we want, as per the docstring, "Returns a list of all installed packages that contain package_name in their name", or should we go to just figuring out if the package is installed? Here is one stab at it plus a test
Attachment #613439 - Flags: review?(aki)
Comment on attachment 613439 [details] [diff] [review] fix + test (In reply to Jeff Hammel [:jhammel] from comment #3) > So I'm not entirely sure what we actually want to do here. Aiui, this is ahal's code that he used in peptest.py. I removed the only call to this function in bug 720436, so any effort expended here won't affect any code that's in use. There's a reason I split this bug off, for futuring. > That is, do we > want, as per the docstring, "Returns a list of all installed packages that > contain package_name in their name", or should we go to just figuring out if > the package is installed? I think the latter. > Here is one stab at it plus a test Using 'pip freeze' looks like a good approach. I may be wrong, but it looks like you're not actually passing the output from 'pip freeze' to self.pip_freeze_packages() ?
(In reply to Aki Sasaki [:aki] from comment #4) > Comment on attachment 613439 [details] [diff] [review] > fix + test > > (In reply to Jeff Hammel [:jhammel] from comment #3) > > So I'm not entirely sure what we actually want to do here. > > Aiui, this is ahal's code that he used in peptest.py. I removed the only > call to this function in bug 720436, so any effort expended here won't > affect any code that's in use. There's a reason I split this bug off, for > futuring. > > > That is, do we > > want, as per the docstring, "Returns a list of all installed packages that > > contain package_name in their name", or should we go to just figuring out if > > the package is installed? > > I think the latter. Cool, I will change the code to do so. Probably makes a lot more sense. > > Here is one stab at it plus a test > > Using 'pip freeze' looks like a good approach. > I may be wrong, but it looks like you're not actually passing the output > from 'pip freeze' to self.pip_freeze_packages() ? Hah! excellent catch! My test doesn't catch that for sure (Out of scope, but we should probably have more tests for VirtualenvMixin; I just put this in as a stub.)
Comment on attachment 613439 [details] [diff] [review] fix + test Will put up a better patch in a bit
Attachment #613439 - Flags: review?(aki) → review-
(In reply to Jeff Hammel [:jhammel] from comment #5) > (Out of scope, > but we should probably have more tests for VirtualenvMixin; I just put this > in as a stub.) I'd like more tests in general, especially for mozharness.base.* and mozharness.mozilla.* .
(In reply to Aki Sasaki [:aki] from comment #7) > (In reply to Jeff Hammel [:jhammel] from comment #5) > > (Out of scope, > > but we should probably have more tests for VirtualenvMixin; I just put this > > in as a stub.) > > I'd like more tests in general, especially for mozharness.base.* and > mozharness.mozilla.* . Well there's that too :)
Attached patch new API (obsolete) — Splinter Review
I've changed this quite a bit in hope to match more what we actually need. The new API deprecates pip_freeze_packages for reading the output and query_python_package for, to quote: """ Returns a list of all installed packages that contain package_name in their name """ Which IMHO is a footgun as 'foo' would match both 'foobar' and 'barfoo' (etc) I have replaced these with package_versions, which returns a dictionary of packages and their versions (via pip freeze, though it can also parse the output if it is parsed) and is_package_installed. I am happy to rename the latter to 'query_python_package' but since what it does is qualitatively different from the other query_ functions I figured I would title it differently.
Attachment #613439 - Attachment is obsolete: true
Attachment #614541 - Flags: review?(aki)
Comment on attachment 614541 [details] [diff] [review] new API >+ if pip_freeze_output is None: >+ # get the output from `pip freeze` >+ pip = self.query_python_path("pip") >+ if not pip: >+ self.log("query_python_package: Program pip not in path", level=error_level) Nit: self.log("package_versions: Program pip... I'm going to guess this is you indenting the previous code. >+ pip_freeze_output = self.get_output_from_command([pip, "freeze"], silent=True) >+ if not pip_freeze_output: >+ # no packages installed >+ return {} Wondering if we should log this. Not sure if it needs to be at error_level, but an info() or debug() would be nice. >+ if '==' not in line: >+ self.fatal("pip_freeze_packages: Unrecognized output line: %s" % line) Hm, maybe self.log(message, level=error_level) with a continue? But certainly if we don't ever expect to see this, it's harmless. Also, when grepping for 'query_python_package' in the repo, I found it in mozharness/mozilla/testing/testbase.py. I added this recently so it's understandable if you missed it. In general, when I change anything in mozharness, I try to make sure that I'm not breaking any other scripts by at least a 'grep -r' and eyeballing a fix. I'm wondering a bit about future name conflicts, since is_package_installed() isn't necessarily python-specific. But is_python_package_installed() is a bit verbose and it's not difficult to rename, so I'm going to leave it alone. This looks good; could you update the comment and change mozharness.mozilla.testing.testbase to call self.is_package_installed() ?
Attachment #614541 - Flags: review?(aki) → review+
Going to go with is_package_installed -> is_python_package_installed I'm also going to change this to be case insensitive e.g. currently is_python_package_installed('mozinstall') would not currently match https://github.com/mozilla/mozbase/blob/master/mozinstall/setup.py since the package name is 'mozInstall'. I figure killing case sensitivity is less of a footgun than having an exact match, plus is in parity with what pypi and easy_install do.
case-insensitive++
(In reply to Aki Sasaki [:aki] from comment #10) > Comment on attachment 614541 [details] [diff] [review] > new API > > >+ if pip_freeze_output is None: > >+ # get the output from `pip freeze` > >+ pip = self.query_python_path("pip") > >+ if not pip: > >+ self.log("query_python_package: Program pip not in path", level=error_level) > > Nit: self.log("package_versions: Program pip... > I'm going to guess this is you indenting the previous code. Thanks, indeed. Fixed. > >+ pip_freeze_output = self.get_output_from_command([pip, "freeze"], silent=True) > >+ if not pip_freeze_output: > >+ # no packages installed > >+ return {} > > Wondering if we should log this. Not sure if it needs to be at error_level, > but an info() or debug() would be nice. This is carried over code. Looking at the situation, this shouldn't happen. I've changed this to check if its a basestring, as the error condition is -1. I've also change get_output_from_command to return '' in the no-op case as this should be a passing case. (FWIW, -1 and a string of output are the only return values, so I could check for -1; however, the rest of the function is only valid for a string). > >+ if '==' not in line: > >+ self.fatal("pip_freeze_packages: Unrecognized output line: %s" % line) > > Hm, maybe self.log(message, level=error_level) with a continue? > But certainly if we don't ever expect to see this, it's harmless. I'd rather leave since if you get here something is wrong. If you're counting on this function to work, you may not be happy. > Also, when grepping for 'query_python_package' in the repo, I found it in > mozharness/mozilla/testing/testbase.py. I added this recently so it's > understandable if you missed it. In general, when I change anything in > mozharness, I try to make sure that I'm not breaking any other scripts by at > least a 'grep -r' and eyeballing a fix. Good catch, will endeavor to do so in the future (I also changed the package name to be mozInstall vs mozinstall...with the case-insensitivity it doesn't really matter, but in general module name and package name have only an informal relationship). > I'm wondering a bit about future name conflicts, since > is_package_installed() isn't necessarily python-specific. But > is_python_package_installed() is a bit verbose and it's not difficult to > rename, so I'm going to leave it alone. Yep; have gone with the latter. > This looks good; could you update the comment and change > mozharness.mozilla.testing.testbase to call self.is_package_installed() ? Yep. I've changed the code enough such that I'm going to ask for another review rather than carrying forward.
Attachment #614541 - Attachment is obsolete: true
Attachment #614582 - Flags: review?(aki)
Attachment #614582 - Flags: review?(aki) → review+
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → Mozharness
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: