Closed Bug 939350 Opened 6 years ago Closed 5 years ago

Create python linter wrapper

Categories

(Core :: General, defect)

defect
Not set

Tracking

()

RESOLVED WONTFIX

People

(Reporter: mcomella, Assigned: mcomella)

References

(Blocks 1 open bug)

Details

(Whiteboard: [mach])

Attachments

(2 files, 4 obsolete files)

It should probably be linter agnostic and useful in other parts of the tree.

Note that this command may need an option to deal with pre-processing in some way, probably by linting the compiled versions of the files.
Should this be moved to wherever mach bugs get filed?
Component: General → mach
Product: Firefox for Android → Core
Bugs for individual mach commands belong in whatever component the tool works on, not Core:Mach. For lack of a code hygiene or community-improvement component this could certainly live in Core:General.

This would be really valuable to the project. Right now I'd love us to focus on the linters that can work directly on the code (without requiring any compilation).

Do we have lint tools/scripts already?
There are multiple pieces to this request.

First, what does lint test? Are you talking full tree linting? Partial tree/individual file linting? Both?

Do you need integration with version control to only lint changed/affected files? If so, linting is probably best facilitated via a hook or additional command rather than a mach command.

Linting will require a lot of extra libraries that aren't in the tree currently. Do we wish to check them into the tree? Keep in mind we can initially store things in ~/.mozbuild (mach can download them from the internets until they are added to the tree). This would facilitate prototyping without littering the tree.

At the core of this feature, I think we're talking about building a "linting API." Probably some Python module that you feed it a directory or list of files to and it invokes the appropriate linter. We can start with just Java if that's what you want.

Also keep in mind prior art at Mozilla:

* We have a Clang plugin for performing static analysis. In theory it can enforce style. Although Clang likely has better tools we can use.
* My "mozext" Mercurial extension style checks Python via hooks and an `hg critic` command (https://hg.mozilla.org/users/gszorc_mozilla.com/hgext-gecko-dev)
Component: mach → General
Whiteboard: [mach]
http://www.johnkeiser.com/cgi-bin/jst-review-cgi.pl is the old-time example. Worse is better. If we can solve hard tabs, windows newlines, trailing whitespace, snuggled if, and such, that's still way better than what we have today.
Attached file jst-review.tar.gz
Archived from jkeiser's site, in case it goes down again.
JST's review script shouldn't be needed. And I even heard it doesn't work "right" any more.

We care about 3 languages:

- Python style checking is a solved problem (pyflakes, pep8, flake8, mccabe).

- Java style checking is a solved problem (too many tools to list).

- C++ is a bit harder, but tools have been written. We can easily get low hanging fruit (whitespace) in C++ without much issue.
(In reply to Gregory Szorc [:gps] from comment #6)
> JST's review script shouldn't be needed. And I even heard it doesn't work
> "right" any more.
> 
> We care about 3 languages:
> 
> - Python style checking is a solved problem (pyflakes, pep8, flake8, mccabe).
> 
> - Java style checking is a solved problem (too many tools to list).

Thanks for the reminder.  Bug 939892 is the equivalent for IDEs; not clear how hard we should work on this from the command line.
If we get this working on the command line and eventually as part of the tree, we can add it to automation and we can start achieving/enforcing/whatever proper code style with little to no human involvement. As I like to say, "every time I type 'nit' in a code review, Mozilla is wasting money on my salary." Humans shouldn't be performing work that machines can do better.
Clang-Format's documentation says that it will format code as per Mozilla's style guide.

http://clang.llvm.org/docs/ClangFormatStyleOptions.html
I /think/ clang-format's style is slightly out of date. Plus, it only works on whole files. The best we can do with clang-format is compare output vs input and report any discrepancies. And for that to work reliably, someone would have to mass-apply clang-format on the tree. Although, with VCS integration, you can filter out violations on lines not touched by the current patch. Although mapping input lines to output lines with significantly changed content is hard without having access to the C++ token stream (something you can get via the Clang C API or Python bindings). What I'm trying to say is C++ style enforcement is harder than Python and Java. Perhaps we should punt it to another bug.
FWIW, Phabricator supports client-side linting at patch upload time (http://www.phabricator.com/docs/phabricator/article/Arcanist_User_Guide_Customizing_Lint,_Unit_Tests_and_Workflows.html). Lint violations are reported in the code review interface. We could presumably do the same for bzexport. Although reporting violations in Splinter may be difficult. We could hook up a bot to perform auto-linting against Bugzilla or Review Board, though.

What I'm getting at is if there is one requirement it is that linting should be invokable as an API so many tools (mach, hg, git, arc, rb, etc) can perform it and can get at the results in a way that can be correlated to source. If you don't do things this way, the value of linting is significantly curtailed. It shouldn't take much work to design this right from the start.
I was wrong: clang-format supports only formatting limited sets of lines or byte ranges. It can also output replacements as XML, which makes the translations machine readable. I should know better than to doubt the awesomeness of the Clang tools.
(In reply to Gregory Szorc [:gps] from comment #3)
> First, what does lint test?

I was hoping to build this as part of bug 939329, which is to enable a linter in Firefox for Android's JavaScript code. What it would test would be decided specifically in bug 939347 (create a jshint config file), but it is intended to initially help developers catch errors before runtime that would ordinarily be caught with a compiler in applicable languages (e.g. using undeclared variables, syntax errors, etc.).

Style is a later plus (and easily set by a few jshint flags), but would require more churn of existing files so it's expected to be completed further down the road.

> Are you talking full tree linting? Partial
> tree/individual file linting? Both?
 
The command is expected (at least initially) to perform partial tree linting, in mobile/android/.

> Do you need integration with version control to only lint changed/affected
> files? If so, linting is probably best facilitated via a hook or additional
> command rather than a mach command.

Not necessarily, although I suppose using version control like you mentioned would cause less developer headache if a change that causes linter errors manages to get checked into the tree. Better yet, we can use the version control to prevent such changes from getting checked into the tree in the first place.

> Linting will require a lot of extra libraries that aren't in the tree
> currently. Do we wish to check them into the tree?

To be determined in bug 939344. Looks likely - downloading sources from the internet runs into offline servers, potential version mismatch, etc. that would be nice to avoid.

> At the core of this feature, I think we're talking about building a "linting
> API."

Good idea! I particularly like the benefit of tool independence as you mentioned in comment 11. Off the top of my head, I'm not quite sure how I would go about this (specifically ensuring the output is correlated to source), but I will research and ask questions if necessary.

I should mention that I don't know the specific needs of the parts of the codebase outside of mobile/android, so assuming I'm the one who ends up implementing this, scrutinize closely! :)
Greg, I wrote up a short spec of what I imagine this Python module/API would be like. Can you double check that this is what you were looking for?

---

lint_file(filename, parse_output=False)
    Runs the appropriate linter on the file of the given name. if parse_output
    is False, output is printed to stdout/stderr as if running the linter
    directly; None is returned as the second value in the tuple. If True, it
    also parses the output of the linter, converts it to lint_result, and
    returns this as the second value of the tuple.
    * Returns (linted_without_err (bool), lint_result)
lint_dir(dirname, parse_output=False)
    Runs lint_file on all of the files in the given directory.
    * Returns (linted_without_err (bool), [lint_result])
lint_file_list(filename_list, parse_output=False)
    Runs lint_file and lint_dir on the given list of input files as appropriate
    * Returns (linted_without_err (bool), [lint_result])

Types:
    linter: {name: (String), cmd: (String), line_parse_fn: (fn)}
        line_parse_fn parses the output of a line from the linter into a single
        lint_result.lint_output value
    lint_result: {
                  filename: (String),
                  linter: (linter),
                  lint_output: [ {line (int), col (int), msg (String)} ]
                 }

Since some linters written in Python may be more efficiently executed via an
import and function call, the linter type may need to be extended at a later
point to permit this.

Linters are stored in {extension: [linter]} (i.e. multiple linters by extension
possible)

The default linters could be potentially augumented via some methods such as:
    register_linter(ext, linter)

If run from the command line, this program will call lint_file_list on the
input args with parse_output=False (i.e. printing the output as if the linter
was called directly).

Each line_parse_fn would likely be written in a separate module, specific to
the linter.

Concerns:
    1) Some linters can take multiple files/direcotries via the command line
and lint them, so calling lint_file (which spawns a new process) on them
individually is inefficient; should this be a concern? Perhaps this can be a
followup for the specific linters that support this (or should we just assume
all of them do? - it'd remove the need for three differnt lint functions).
    2) I did not closely look at the API behind mach or hg to see how well this
might work with them, but it seems to be a simple function wrap and some
decorators.
    3) Maybe lint_dir should be merged as an if case of lint_file? I thought it
was good to be explicit though.
    4) Maybe lint_file and lint_dir should be private methods and
lint_file_list should be the only public method.
Flags: needinfo?(gps)
I probably won't get to this for a few days since I'm on a plane for 14 hours tomorrow and need to recover from jetlag.
Flags: needinfo?(gps)
Flags: needinfo?(gps)
See Also: → 875605
After check-moz-style (bug 875605) lands, it would be a good candidate for mach's C++ lint.
Sorry for the long lag. There is no excuse for me taking this long to get back to you. Please poke me if I do this again.

(In reply to Michael Comella (:mcomella) from comment #14)
> lint_file(filename, parse_output=False)
>     Runs the appropriate linter on the file of the given name. if
> parse_output
>     is False, output is printed to stdout/stderr as if running the linter
>     directly; None is returned as the second value in the tuple. If True, it
>     also parses the output of the linter, converts it to lint_result, and
>     returns this as the second value of the tuple.
>     * Returns (linted_without_err (bool), lint_result)
> lint_dir(dirname, parse_output=False)
>     Runs lint_file on all of the files in the given directory.
>     * Returns (linted_without_err (bool), [lint_result])
> lint_file_list(filename_list, parse_output=False)
>     Runs lint_file and lint_dir on the given list of input files as
> appropriate
>     * Returns (linted_without_err (bool), [lint_result])
> 
> Types:
>     linter: {name: (String), cmd: (String), line_parse_fn: (fn)}
>         line_parse_fn parses the output of a line from the linter into a
> single
>         lint_result.lint_output value
>     lint_result: {
>                   filename: (String),
>                   linter: (linter),
>                   lint_output: [ {line (int), col (int), msg (String)} ]
>                  }

This seems like a great start!

I'm tempted to invoke "perfect is the enemy of good" and say "who cares about a formal representation of lint results - let's just land something already." But I know that any lint infrastructure we write will eventually hook into version control, etc because that's what people will want. People will want the ability to easily go to the line where lint issues are detected. People will want to build dashboards. This will require machine readable output. The good news is most linting tools are pretty good about producing machine readable output. So I think for a little effort we can not sacrifice the future. Please check my reasoning!

> Since some linters written in Python may be more efficiently executed via an
> import and function call, the linter type may need to be extended at a later
> point to permit this.
> 
> Linters are stored in {extension: [linter]} (i.e. multiple linters by
> extension
> possible)
> 
> The default linters could be potentially augumented via some methods such as:
>     register_linter(ext, linter)
> 
> If run from the command line, this program will call lint_file_list on the
> input args with parse_output=False (i.e. printing the output as if the linter
> was called directly).
> 
> Each line_parse_fn would likely be written in a separate module, specific to
> the linter.

I like this approach. Idea: have a single function that receives the full set of directories/files/lines we desire to lint and let each linter implementation decide for itself what the most optimal execution mechanism is.

> Concerns:
>     1) Some linters can take multiple files/direcotries via the command line
> and lint them, so calling lint_file (which spawns a new process) on them
> individually is inefficient; should this be a concern? Perhaps this can be a
> followup for the specific linters that support this (or should we just assume
> all of them do? - it'd remove the need for three differnt lint functions).

See above. Consider combining the "lint()" API.

>     2) I did not closely look at the API behind mach or hg to see how well
> this
> might work with them, but it seems to be a simple function wrap and some
> decorators.

mach is just a function dispatcher with decorators to hook up the plumbing. See https://developer.mozilla.org/en-US/docs/Developer_Guide/mach#Adding_Features_to_mach

hg is slightly more involved, but the basic principle is the same. I can help you author an hg extension.

since mach and hg commands are effectively dispatchers, I think the important takeaway is that you develop linting as a standalone "library" that other tools can call into. If nothing else, this should make testing easier.
Flags: needinfo?(gps)
Assignee: nobody → michael.l.comella
Status: NEW → ASSIGNED
Summary: Add `mach lint` command → Create python linter wrapper
Attached patch Add python linter wrapper. (obsolete) — Splinter Review
This is a patch without output parsing support, cl args, and full
documentation. I reduced the API to a lint() function which will run the
linters on the appropriate files by extension.

Some missing info:
  * Built-in lint modules are located in wrappers/ and should have NAME (str),
EXTENSIONS (list), and lint (func) attr. See wrappers/jshint.py.
  * lint() will return lint_results, which are equivalent to the type mentioned
in comment 14
  * _linter_to_func and _extension_to_linter have no purpose now but should be
used later to run specific linters and file extensions from the command line.

Note: hg critic dislikes my __init__.py files and I'm not sure why.
Comment on attachment 8348450 [details] [diff] [review]
Add python linter wrapper.

Review of attachment 8348450 [details] [diff] [review]:
-----------------------------------------------------------------

::: python/lint/wrappers/jshint.py
@@ +8,5 @@
> +import util
> +
> +NAME = 'jshint'
> +EXTENSIONS = [
> +    'js'

And 'jsm' ?  I've found that jshint works fine over jsm files, though it struggles a little with globals.

@@ +11,5 @@
> +EXTENSIONS = [
> +    'js'
> +]
> +
> +_COMMAND = '/usr/bin/jshint'  # TODO: Use the jshint from the tree.

I'm not seeing any hint of jshint configuration here.  A reasonable set of defaults would ensure that people don't have to put /* jshint */ comments in every piece of code.

I'm not done yet, but here's a start:
{
  "moz": true,
  "browser": true,
  "globals": {
    "Components": true,
    "XPCOMUtils": true,
    "Services": true
  }
}
Comment on attachment 8348450 [details] [diff] [review]
Add python linter wrapper.

Review of attachment 8348450 [details] [diff] [review]:
-----------------------------------------------------------------

I like it!

Don't hesitate to ask me if you need help with any of the Python bits.

::: python/lint/lint.py
@@ +2,5 @@
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +
> +# TODO: Use logging module to to allow debug output.
> +# TODO: How to handle preprocessing?

mozbuild.preprocessing.Preprocessor contains the preprocessor class that is used throughout the tree. You can preprocess files "in process" and pipe the output to a temp file or to stdin of a linting process.

The hard part with preprocessing is setting up the options that individual files expect (i.e. integrating with the build config). We are actively working on better hookups of the preprocessor in build system land. This should become much easier in the months ahead. For now, it's probably best to leave it as a followup for a build peer to address later.

FWIW, we are trending towards a world where the build config knows more and more about files active in the current config. This /could/ be used to populate the set of files to lint. Not sure if this would bring any benefits over filesystem walking. Speaking of filesystem walking, see mozpack.files.FileFinder. e.g.

from mozpack.files import FileFinder
finder = FileFinder('.', find_executables=False, ignore=['obj*'])
for path, f in finder.find('**/*.js'):
    pass

@@ +24,5 @@
> +def lint(file_list, parse_output=False):
> +    """
> +    Runs the applicable linters on the given file list and returns a list of
> +    lint_result. TODO: Explain lint_result and return value.
> +    """

You should consider how you want to handle linting of in-memory strings. I suppose this could be a named argument. Or, you could just require the caller to pass in a file object. Although the problem with passing in file objects is that if you want to lint thousands of files, you may run into open file descriptor limits at the OS level.

@@ +25,5 @@
> +    """
> +    Runs the applicable linters on the given file list and returns a list of
> +    lint_result. TODO: Explain lint_result and return value.
> +    """
> +    assert type(file_list) is list

assert(isinstance(file_list, list))

Actually, it's probably better to check whether the thing is iterable and allow any iterable object to be passed in. See http://stackoverflow.com/questions/1952464/in-python-how-do-i-determine-if-an-object-is-iterable. Watch out for strings, which are iterable.

::: python/lint/wrappers/jshint.py
@@ +4,5 @@
> +
> +from os.path import isdir
> +import subprocess
> +
> +import util

Where does util come from?

@@ +11,5 @@
> +EXTENSIONS = [
> +    'js'
> +]
> +
> +_COMMAND = '/usr/bin/jshint'  # TODO: Use the jshint from the tree.

import which
try:
    jshint = which.which('jshint')
except which.WhichError:
    print('jshint not found. linter not available!')

@@ +20,5 @@
> +    Runs jshint on the given list of files. TODO: parse_output, parameter
> +    desc.
> +    """
> +    assert type(file_list) is list
> +    assert type(parse_output) is bool

isinstance

@@ +35,5 @@
> +
> +def _should_lint(filename):
> +    filename = filename.lower()
> +    extension_is_valid = any(map(lambda ext: filename.endswith(ext),
> +                                 EXTENSIONS))

Fun fact: str.endswith() accepts a tuple of items to match against!
Attachment #8348450 - Flags: feedback?(gps) → feedback+
I am interested..
I spoke with anra_ka via email and he will take a stab at it because I will not be able to look at it until at least 1/6.
Assignee: michael.l.comella → anilreddykatta
anra_ka, are you still working on this?

I'm back from vacation and will have time to work on this on the side so I can take over if you don't have time.
Flags: needinfo?(anilreddykatta)
Assignee: anilreddykatta → michael.l.comella
Flags: needinfo?(anilreddykatta)
(In reply to Gregory Szorc [:gps] from comment #20)
> You should consider how you want to handle linting of in-memory strings. I
> suppose this could be a named argument. Or, you could just require the
> caller to pass in a file object. Although the problem with passing in file
> objects is that if you want to lint thousands of files, you may run into
> open file descriptor limits at the OS level.

Why do we need to lint in-memory strings? Is it because of the open file limits mentioned above? Presumably, since I'm taking in file paths, we don't need to handle any issues with open file limits and it's up to the underlying linter implementations.
Flags: needinfo?(gps)
Attached patch Add python linter wrapper. (obsolete) — Splinter Review
I made changes from your feedback, besides the in-memory string stuff (still
waiting on that needinfo).

Additionally, I made some additions:
  * Use argparse to get files from the cl
  * Exit with return codes
  * Dynamically register built-in modules
  * Better subprocess handling in wrapper.jshint

Still TODO before this lands:
  * Documentation/comments
  * Decide on how to handle in-memory strings

TODO in a follow-up bug:
  * Parse jshint output to get in-memory data structures for manipulation
  * jshint config (bug 939347)
  * Integrate with either hg or mach to lint at qrefresh/build time.
  * Handle pre-processing (I'm not sure I entirely understood your
recommendations so don't know what the implications might be for ignoring pre-
processing for now)
Attachment #8348450 - Attachment is obsolete: true
Comment on attachment 8358108 [details] [diff] [review]
Add python linter wrapper.

Review of attachment 8358108 [details] [diff] [review]:
-----------------------------------------------------------------

Instead of having linters defined as a set of symbols on a module, I'd represent things as a class. The modeling will be much cleaner and it will be possible to enforce sanity more easily than if things are implemented as modules.

::: python/lint/lint.py
@@ +85,5 @@
> +    pkg_prefix = wrappers.__name__ + '.'
> +    for _, module_name, is_pkg in pkgutil.walk_packages(wrappers.__path__,
> +                                                        pkg_prefix):
> +        if not is_pkg:
> +            modules.append(import_module(module_name))

The preferred way to do this in Python is via entry points:

http://pythonhosted.org/setuptools/pkg_resources.html#entry-points

This will require you to create a setup.py file for this new Python package. That's not the worst thing in the world. We're trending in a direction where we'll use setup.py to populate the Python virtualenv instead of using .pth files.

::: python/lint/wrappers/jshint.py
@@ +48,5 @@
> +    except subprocess.CalledProcessError as e:
> +        # If we're not parsing and printing the output ourselves, it should
> +        # print directly to the shell.
> +        if not parse_output:
> +            print e.output

Nit: Use print() function, not statement, for future Python 3 compatibility.
Attachment #8358108 - Flags: feedback?(gps) → feedback+
(In reply to Gregory Szorc [:gps] from comment #26)dules.
> The preferred way to do this in Python is via entry points:
> 
> http://pythonhosted.org/setuptools/pkg_resources.html#entry-points
> 
> This will require you to create a setup.py file for this new Python package.
> That's not the worst thing in the world. We're trending in a direction where
> we'll use setup.py to populate the Python virtualenv instead of using .pth
> files.

I'm not too familiar with dist/setuptools and after spending some time with this, I think it'd be more effective for me (or someone else) to fix this as a followup.
Attached patch Add python linter wrapper. (obsolete) — Splinter Review
Used the print function and moved jshint wrapper code to a class. This let me
get rid of some dicts (linter_ext -> linter_name, linter_name -> linter_func)
and in general cleaned up the code. I'm not sure why using classes makes it
easier to enforce sanity (if you have the time, I wouldn't mind an elaboration)
but I do prefer the encapsulation.
Attachment #8358108 - Attachment is obsolete: true
Flags: needinfo?(gps)
Comment on attachment 8362019 [details] [diff] [review]
Add python linter wrapper.

Review of attachment 8362019 [details] [diff] [review]:
-----------------------------------------------------------------

Due to planned restructuring of how the virtualenv is populated, it's highly recommended to put files in a directory holding the name of the package. e.g. python/lint/lint/lint.py. This will make the transition to a setup.py install mechanism easier.

Sorry for the long lag.

::: python/lint/lint.py
@@ +40,5 @@
> +    overall_success = True
> +    lint_result_list = []
> +    # TODO: Print error if no applicable linters.
> +    for linter_class in _linter_class_list:
> +        linter = linter_class()  # TODO: Allocation too inefficient?

The allocation should not be a concern unless you have hundreds of linters or __init__ is doing something crazy.

@@ +56,5 @@
> +    linter_instance = linter_class()
> +    assert isinstance(linter_instance.NAME, str)
> +    assert isinstance(linter_instance.EXTENSIONS, Iterable)
> +    assert not isinstance(linter_instance.EXTENSIONS, str)
> +    assert callable(linter_instance.lint)

I would do this checking in __new__ or __init__ of the base linter class. There are also some decorators in abc (http://docs.python.org/2/library/abc.html) that allow you to require abstract methods and properties so child classes not defining them will raise.

Wait, is there even a base linter class? IMO there should be.

@@ +74,5 @@
> +                             'the current directory.')
> +    return parser.parse_args()
> +
> +
> +# TODO: Use entry_points in setuptools.

Our virtualenv doesn't (yet) integrate with setuptools and entry_points that well. I'd like to change this. But it doesn't contribute to build speed, so it's hard to justify working on it.

::: python/lint/wrappers/jshint.py
@@ +52,5 @@
> +        desc.
> +        """
> +        assert isinstance(file_list, Iterable)
> +        assert not isinstance(file_list, str)
> +        assert isinstance(parse_output, bool)

Excessive validation is considered an anti-pattern in Python. For "is bool" just |if var:| and let Python do the coercion. For iteration, usually |for foo in iterable| is a sufficient test.

@@ +92,5 @@
> +    def _parse_output(output):
> +        raise NotImplementedError('jshint._parse_output')
> +
> +    @property
> +    def NAME(self):

I wouldn't bother with caps.
Attachment #8362019 - Flags: feedback?(gps) → feedback+
mcomella: any updates on your lint wrapper? I look forward to seeing linters for Python, JS, and C++ integrated into a common mach interface. :)
Flags: needinfo?(michael.l.comella)
Unfortunately, I haven't had a chance to work on it recently. If someone else would like to continue it, it might go a little faster, but I'll see if I can dig in again this week.
Flags: needinfo?(michael.l.comella)
/r/3349 - Bug 939350: Add python linter wrapper. r=gps

Pull down this commit:

hg pull review -r ebe66faf086679eae71e1d2f9bda12a7850d19ce
Attachment #8559531 - Flags: review?(gps)
The commit in comment 33 allows me to run:

 mach python lint.py <file(s)/dir(s) to lint>

I addressed the issue from comment 30.

TODO in a follow-up bugs, if you agree this can land first:
  * Use entry_points over whatever I'm doing to get builtin-linters (comment 26)
  * Parse jshint output to get in-memory data structures for manipulation
  * jshint config (bug 939347)
  * Decide on how to handle in-memory strings
  * Integrate with either hg or mach to lint at commit/push/build time.
  * Handle pre-processing (I'm not sure I entirely understood your recommendations so don't know what the implications might be for ignoring pre-processing for now)
Also, inheriting docstrings from the ABC would improve my documentation, ala http://stackoverflow.com/a/2025581.
Attachment #8362019 - Attachment is obsolete: true
https://reviewboard.mozilla.org/r/3349/#review3001

This is progress.

Do you plan to add a `mach lint` command?

Also, it is easy to integrate Python linting into things. See /r/2835.

::: python/lint/lint/lint.py
(Diff revision 1)
> +    pkg_prefix = wrappers.__name__ + '.'
> +    for _, module_name, is_pkg in pkgutil.walk_packages(wrappers.__path__,
> +                                                        pkg_prefix):
> +        if not is_pkg:
> +            import_module(module_name)

I'd actually be surprised if this works without running setup.py. Does it?

::: python/lint/lint/wrappers/base_linter.py
(Diff revision 1)
> +        raise NotImplementedError

First, this will raise the `NotImplementedError` type, not an instance of that type. You almost always want to `raise NotImplementedError()`.

Second, `@abstractproperty` causes a failure at class parse time unless the property is defined in children. So, I'm pretty sure this line will never be executed.

Third, a lot of people write something like:

def foo():
    """Quick description."""
    pass
    
You only need `pass` on empty functions. And, a docstring makes a function non-empty. So, no `pass` is needed if there is a docstring.

::: python/lint/lint/lint.py
(Diff revision 1)
> +        is_success, lint_result = linter.lint(file_list, parse_output)

This should be wrapped in a try..except.
Conceivably relevant to this is that we've started using eslint in Hello.  https://dmose2.wordpress.com/2015/04/01/bootstrapping-eslint-for-hello/ has some details, and a link to the bug.
https://reviewboard.mozilla.org/r/3349/#review5777

Yes, I plan to use this to implement a `mach lint` command.

> I'd actually be surprised if this works without running setup.py. Does it?

It does run without setup.py when I run via `mach python lint.py <file(s)/dir(s) to lint>` within the python/lint/lint/ directory.
Comment on attachment 8559531 [details]
MozReview Request: bz://939350/mcomella

/r/3349 - Bug 939350: Add python linter wrapper. r=gps

Pull down this commit:

hg pull -r cc17295e2e3deffca9651be1634f411fb86353bc https://reviewboard-hg.mozilla.org/gecko/
I prefer to work in smaller bugs so it'd be better for me to land this ASAP and get working on followups (see comment 34) - what else do you need for this to land?
https://reviewboard.mozilla.org/r/3349/#review7157

::: python/lint/lint/lint.py:99
(Diff revision 2)
> +    for _, module_name, is_pkg in pkgutil.walk_packages(wrappers.__path__,
> +                                                        pkg_prefix):

I'm not sure this will work in our fake virtualenv setup. Did you test this?

::: python/lint/lint/lint.py:88
(Diff revision 2)
> +# TODO: Use entry_points in setuptools?

entry_points may not work with our fake virtualenv.

::: python/lint/lint/lint.py:75
(Diff revision 2)
> +def _parse_args():

The model we like to use is to have a function return the ArgumentParser so it can be used against any input, not just the implicit sys.argv which parse_args() assumes. But for now, this is probably OK.

::: python/lint/lint/lint.py:49
(Diff revision 2)
> +        if not linter.is_executable():
> +            continue

You probably want this to be modeled as an exception raised during lint(). It will enforce fewer requirements on how linter instances must be implemented.

::: python/lint/lint/wrappers/base_linter.py:40
(Diff revision 2)
> +    def is_executable(self):
> +        return bool(self._exec_path)

This assumes linters are executables. This won't always be true. Please roll this simple check into the implementation of each linter. Or create a base class that has the logic.

::: python/lint/lint/wrappers/base_linter.py:17
(Diff revision 2)
> +    def __new__(cls, *args, **kwargs):
> +        obj = object.__new__(cls, args, kwargs)
> +
> +        assert isinstance(obj.name, str)
> +        assert isinstance(obj.extensions, Iterable)
> +        assert not isinstance(obj.extensions, str)
> +        assert isinstance(obj.executable_name, str)
> +
> +        return obj

This level of type checking is typically omitted from Python programs on the basis it isn't necessary.

Also, "Iterable" is wonky. `isinstance('foo', Iterable)` is True.

::: python/lint/lint/wrappers/jshint.py:36
(Diff revision 2)
> +        file_list = filter(self._should_lint, file_list)

What if file_list is empty?

::: python/lint/lint/wrappers/base_linter.py:70
(Diff revision 2)
> +    @abstractproperty
> +    def name(self):
> +        """A user-recognizable name for the linter."""
> +
> +    @abstractproperty
> +    def extensions(self):
> +        """A list of file extensions this linter can operate on."""
> +
> +    @abstractproperty
> +    def executable_name(self):
> +        """The basename of the linter executable (e.g. "jshint")."""

I don't think you need these. Make them arguments to __init__ on this base class. Drop the _ prefixing in the child class. It just adds unnecessary layers of complexity.
Comment on attachment 8559531 [details]
MozReview Request: bz://939350/mcomella

Sorry the review took so long. I owe better. Please escalate to my manager if I ever do this again.
Attachment #8559531 - Flags: review?(gps)
Attachment #8559531 - Attachment is obsolete: true
I want to note this somewhere, but I don't know quite where.  Here'll do!

There's a Gradle plugin making it easy-ish to automatically format Java code:

https://github.com/youribonnaffe/gradle-format-plugin

This ticket tracks doing the right thing in automation, etc -- but we've had recent success improving our local story first.  We could try that approach again.
tldr; I say we wait for an organic solution that is more likely to fit our needs.

In its current state, the wrapper is:
  1) An aggregation of all linter commands so `mach lint` would lint the tree (e.g. for mobile, run both eslint and android-lint)
  2) A consistent way to return linter results for manipulation
  3) Python bindings to the linters (e.g. so we can easily merge it with hg and/or mach).

While 1) is nice, I think it will come about organically when some team has multiple lints to run (e.g. I'm working on both eslint and android-lint for mobile and, once they're both running regularly, might consider aggregating them, assuming automation runs them the same way).

2) doesn't seem as useful to me.

3) has potential, but it's unclear how these linters should fit into automation and thus if this python support is useful (and mach can just wrap the binaries/run scripts anyway).
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
(In reply to Nick Alexander :nalexander from comment #45)
> I want to note this somewhere, but I don't know quite where.  Here'll do!
> 
> There's a Gradle plugin making it easy-ish to automatically format Java code:

I filed bug 1181307 for this.
See Also: → 1230962
You need to log in before you can comment on or make changes to this bug.