Closed Bug 1196263 Opened 9 years ago Closed 9 years ago

Create a Python library for writing static analysis bots

Categories

(MozReview Graveyard :: General, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dminor, Assigned: dminor)

References

Details

Attachments

(4 files)

Creating a library to abstract away interactions with Pulse and other common tasks would make it easier to write static analysis bots.

We started work on a Python bot in Bug 1124530. We will take some code from there and use it as a basis for a library. This library will be used to write a simple bot that will be both a test of the library and an example of how to create a bot. Once that is complete, we'll rework the Python bot to make use of this library.

We'll write the library in Python on the assumption that most users can use the subprocess module to launch an existing static analysis tool in a separate process if it is written in language other than Python.

This is a Q3 deliverable for me.
Priority: P4 → P1
Depends on: 1124530
Blocks: 1124530
No longer depends on: 1124530
mozreviewbots: add a library for writing static analysis bots (bug 1196263) r?smacleod

This adds a library for writing static analysis bots for MozReview. The goal is
to abstract away the Pulse interactions (and eventually, any other code that is
shared between bots) to make it easier to write and maintain static analysis
bots.
Attachment #8650609 - Flags: review?(smacleod)
mozreviewbots: add SnarkBot (bug 1196263) r?smacleod

SnarkBot is both a simple example of how to write a static analysis bot for
MozReview and a means of testing the MozReviewBot base class.
Attachment #8650610 - Flags: review?(smacleod)
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16647/#review15211

Clearing review as discussed.

::: pylib/mozreviewbots/mozreviewbotlib/batchreview.py:19
(Diff revision 1)
> +            'first_line': first_line,

As discussed in person we need to translate these file line numbers into the row numbers which Review Board takes (The code to do this can be found in the old review requests).
Attachment #8650609 - Flags: review?(smacleod)
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16649/#review15217

Clearing for now.

::: pylib/mozreviewbots/snarkbot/__main__.py:13
(Diff revision 1)
> +    def process_commits(self, repo_url, commits):
> +        """This function is called for each group of commits that is found on
> +           through Pulse.

We should have a discussion about the different pulse messages that will be supported, and the different modes of processing they'll correspond to for bots.

::: pylib/mozreviewbots/snarkbot/__main__.py:33
(Diff revision 1)
> +            # provided by the MozReview extension. You could also create the
> +            # comments individually if desired.

We should definitely not recommend this. The reason I'd written the batch review functionality is I'd originally had a bot which would use the normal API and it ended up bogging RB server's down quite a bit, and taking quite a while to post all the comments.

The batch review functionality is much preferred.
Attachment #8650610 - Flags: review?(smacleod)
Attachment #8650609 - Flags: review?(smacleod)
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

mozreviewbots: add a library for writing static analysis bots (bug 1196263) r?smacleod

This adds a library for writing static analysis bots for MozReview. The goal is
to abstract away the Pulse interactions (and eventually, any other code that is
shared between bots) to make it easier to write and maintain static analysis
bots.
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

mozreviewbots: add SnarkBot (bug 1196263) r?smacleod

SnarkBot is both a simple example of how to write a static analysis bot for
MozReview and a means of testing the MozReviewBot base class.
Attachment #8650610 - Flags: review?(smacleod)
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16647/#review17033

::: pylib/mozreviewbots/requirements.txt:1
(Diff revision 2)
> +kombu

rbtools is also required. I'd think there would be some more requirements as well? how did you generate this file?
Attachment #8650609 - Flags: review?(smacleod)
https://reviewboard.mozilla.org/r/16647/#review17037

::: pylib/mozreviewbots/tests/hghave:7
(Diff revision 2)
> +execfile(os.path.join(REPO_ROOT, 'testing', 'hghave.py'))

It seems like some of this stuff is missing? did you miss some files to commit?

It seems like a lot of greg's code for managing repo clones etc. has gone missing too?
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16649/#review17035

::: pylib/mozreviewbots/snarkbot/__main__.py:41
(Diff revision 2)
> +            files = [] 

trailing whitespace.

::: pylib/mozreviewbots/snarkbot/__main__.py:82
(Diff revision 2)
> +    parser.add_argument('--config-path',
> +                        help='Path to configuration file for bot options')
> +    parser.add_argument('--reviewboard-url',
> +                        help='Host on which Reviewboard is running')
> +    parser.add_argument('--reviewboard-user', help='Reviewboard user')
> +    parser.add_argument('--reviewboard-password', help='Reviewboard password')
> +    parser.add_argument('--pulse-host',
> +                        help='Host on which Pulse is running')
> +    parser.add_argument('--pulse-port', type=int,
> +                        help='Port on which Pulse is listening')
> +    parser.add_argument('--pulse-userid', default='guest',
> +                        help='Pulse userid')
> +    parser.add_argument('--pulse-password', default='guest',
> +                        help='Pulse password')
> +    parser.add_argument('--exchange', default='exchange/mozreview/',
> +                        help='Pulse exchange')
> +    parser.add_argument('--pulse-timeout', default=1.0, type=float,
> +                        help='How long to wait for messages per iteration')
> +    parser.add_argument('--log-level', default='INFO',
> +                        help='Log level at which to log events')

a lot of this seems like boilerplate which each tool shouldn't have to include

::: pylib/mozreviewbots/snarkbot/__main__.py:110
(Diff revision 2)
> +    snark = SnarkBot(args.snark, config_path=args.config_path,
> +                     reviewboard_user=args.reviewboard_user,
> +                     reviewboard_password=args.reviewboard_password,
> +                     pulse_userid=args.pulse_userid,
> +                     pulse_password=args.pulse_password,
> +                     exchange=args.exchange, queue='all', routing_key='#',
> +                     pulse_timeout=args.pulse_timeout)
> +
> +    # This will drain all available messages on Pulse and call process_commits
> +    # for the commits associated with each message. Use listen_forever() to
> +    # process messages in a endless loop.
> +    snark.handle_available_messages()

Again, I feel like this is boilerplate. Maybe a lot of this stuff should be in a base class, or common entrypoint which calls out to your tool class?
Attachment #8650610 - Flags: review?(smacleod)
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

mozreviewbots: add a library for writing static analysis bots (bug 1196263) r?smacleod

This adds a library for writing static analysis bots for MozReview. The goal is
to abstract away the Pulse interactions (and eventually, any other code that is
shared between bots) to make it easier to write and maintain static analysis
bots.
Attachment #8650609 - Flags: review?(smacleod)
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

mozreviewbots: add SnarkBot (bug 1196263) r?smacleod

SnarkBot is both a simple example of how to write a static analysis bot for
MozReview and a means of testing the MozReviewBot base class.
Attachment #8650610 - Flags: review?(smacleod)
mozreviewbots: add python static analysis bot (bug 1196263) r?smacleod

This adds a static analysis bot for Python based upon the flake8 tool.
Attachment #8663102 - Flags: review?(smacleod)
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16647/#review18423

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:57
(Diff revision 3)
> +            routing_key = routing_key or config.get('pulse', 'routing_key')

the default of '#' is truthy so we'll never actually check the config here.

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:58
(Diff revision 3)
> +            timeout = pulse_timeout or config.get('pulse', 'timeout')

the default of `60.0` is truthy so this will never actually check the config (unless someone passes something like `pulse_timeout=None` when instantiating... which seems strange).

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:90
(Diff revision 3)
> +        self.timeout = timeout
> +
> +        self.pulse_timeout = pulse_timeout

why do we have both of these, it seems like above we set timeout to pulse_timeout?

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:215
(Diff revision 3)
> +        """This parses the relevant files from the diff, returning a
> +           dictionary where the keys are the filenames and the values are the
> +           line numbers. Only files with paths for which predicate returns
> +           true are used.
> +        """

this docstring should be formatted according to pep 0257

::: pylib/mozreviewbots/mozreviewbotlib/batchreview.py:39
(Diff revision 3)
> +        return self._destfile_to_file[destfile]

we should probably call `.get()` here rather than directly accessing, that way if the user attempts to comment on an unchanged file, this returns `None` rather than throws an exception and we could indicate that to the caller in a nice way.

::: pylib/mozreviewbots/mozreviewbotlib/batchreview.py:79
(Diff revision 3)
> +        f = self.destfile_to_file(filename)

This is where we should make sure this returns something meaningful and only make the comment if it does.
Attachment #8650609 - Flags: review?(smacleod)
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16649/#review18425

::: pylib/mozreviewbots/snarkbot/__main__.py:6
(Diff revision 3)
> +    """This implements a simple review bot that can be used as a basis
> +       for creating more sophisticated reviewers. It is also used to test
> +       the MozReviewBot base class."""

One line summary and then a blank followed by more details. trailing `"""` should be on its own line with multi-line docstring

::: pylib/mozreviewbots/snarkbot/__main__.py:15
(Diff revision 3)
> +        """This function is called for each group of commits that is found on
> +           through Pulse.

should be kept to one line.

::: pylib/mozreviewbots/snarkbot/__main__.py:34
(Diff revision 3)
> +            review = BatchReview(self.api_root, rrid, diff_revision)

Part of me thinks this should already be created for the tool (or lazily created on property access or something)

::: pylib/mozreviewbots/snarkbot/__main__.py:36
(Diff revision 3)
> +            # We fetch the files that were changed in this commit. This list
> +            # could be filtered so we only look at certain file types before
> +            # we continue.

this also feels like something that should be in the base class so we don't need to touch the RB API client when we're creating one of these.
Attachment #8650610 - Flags: review?(smacleod)
Comment on attachment 8663102 [details]
MozReview Request: mozreviewbots: add python static analysis bot (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/19719/#review18349

::: pylib/mozreviewbots/pylintbot/__main__.py:90
(Diff revision 1)
> +    def __init__(self, *args, **kwargs):
> +        super(PylintBot, self).__init__(*args, **kwargs)

remove this since we're just calling super.
Attachment #8663102 - Flags: review?(smacleod)
https://reviewboard.mozilla.org/r/16647/#review18423

> this docstring should be formatted according to pep 0257

This is actually dead code that should have been removed.
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

mozreviewbots: add a library for writing static analysis bots (bug 1196263) r?smacleod

This adds a library for writing static analysis bots for MozReview. The goal is
to abstract away the Pulse interactions (and eventually, any other code that is
shared between bots) to make it easier to write and maintain static analysis
bots.
Attachment #8650609 - Flags: review?(smacleod)
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

mozreviewbots: add SnarkBot (bug 1196263) r?smacleod

SnarkBot is both a simple example of how to write a static analysis bot for
MozReview and a means of testing the MozReviewBot base class.
Attachment #8650610 - Flags: review?(smacleod)
Comment on attachment 8663102 [details]
MozReview Request: mozreviewbots: add python static analysis bot (bug 1196263) r=gps

mozreviewbots: add python static analysis bot (bug 1196263) r?smacleod

This adds a static analysis bot for Python based upon the flake8 tool.
Attachment #8663102 - Flags: review?(smacleod)
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/16647/diff/4-5/
Attachment #8650609 - Attachment description: MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r?smacleod → MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps
Attachment #8650609 - Flags: review?(smacleod) → review?(gps)
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/16649/diff/4-5/
Attachment #8650610 - Attachment description: MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r?smacleod → MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps
Attachment #8650610 - Flags: review?(smacleod) → review?(gps)
Attachment #8663102 - Attachment description: MozReview Request: mozreviewbots: add python static analysis bot (bug 1196263) r?smacleod → MozReview Request: mozreviewbots: add python static analysis bot (bug 1196263) r=gps
Attachment #8663102 - Flags: review?(smacleod) → review?(gps)
Comment on attachment 8663102 [details]
MozReview Request: mozreviewbots: add python static analysis bot (bug 1196263) r=gps

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/19719/diff/2-3/
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16647/#review23007

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:1
(Diff revision 5)
> +import hashlib

Need MPL header.

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:46
(Diff revision 5)
> +            reviewboard_url = reviewboard_url or config.get('reviewboard',
> +                                                            'url')
> +            reviewboard_user = reviewboard_user or config.get('reviewboard',

IIRC ConfigParser raises when accessing a section that doesn't exist. Not sure if you care enough catch this. I think aborting on invalid config file is fine. But the error message might not be intuitive.

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:87
(Diff revision 5)
> +        self.exchange = Exchange(exchange, type='topic', durable=True)
> +        self.queue = Queue(name=queue, exchange=self.exchange, durable=True,
> +                           routing_key=routing_key, exclusive=False,
> +                           auto_delete=False)

Be sure to perform the obligatory troll of mcote before this lands and ask him who creates the queue :P

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:96
(Diff revision 5)
> +        for DIR in os.environ['PATH'].split(os.pathsep):
> +            p = os.path.join(DIR, 'hg')
> +            if os.path.exists(p):
> +                self.hg = p

See also the "which" 3rd party package. If this is the only use, meh.

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:123
(Diff revision 5)
> +        env['HGRCPATH'] = '/dev/null'

Doing this globally is a bit dangerous, as it may disable extensions and settings there were deployed for e.g. performance reasons. Let's not do this unless we need to.

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:134
(Diff revision 5)
> +        # TODO: Use the root changeset in each repository as an identifier.
> +        #       This will enable "forks" to share the same local clone.
> +        #       Unfortunately, this isn't made available via `hg`.

The "share" extension now has support for this. Read `hg help -e share` for details about "pooled storage." We should probably deploy that.

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:137
(Diff revision 5)
> +        # TODO: Rather than cloning the review repository, we should be
> +        #       cloning the primary repository, e.g.
> +        #       clone hg.mozilla.org/mozilla-central rather than
> +        #       reviewboard-hg.mozilla.org/gecko

Cloning the review repos may take a while. I'd highly prefer to seed from the upstream repo if we can. We expose this metadata in Review Board, right?

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:183
(Diff revision 5)
> +        self._run_hg(['-R', repo_path, 'up', '-C', node])
> +        self._run_hg(['-R', repo_path, '--config',
> +                     'extensions.purge=', 'purge', '--all'])

I'm pretty sure we don't need to do this here. But we do need to do this for bots that need to access files.

::: pylib/mozreviewbots/mozreviewbotlib/batchreview.py:1
(Diff revision 5)
> +import json

MPL header.

::: pylib/mozreviewbots/mozreviewbotlib/batchreview.py:7
(Diff revision 5)
> +class BatchReview(object):
> +    def __init__(self, api_root, review_request_id, diff_revision,
> +                 max_comments=100, logger=None):

Where are the docstrings?

::: pylib/mozreviewbots/mozreviewbotlib/batchreview.py:66
(Diff revision 5)
> +        line_num_index = 4
> +        if original:
> +            line_num_index = 1
> +
> +        for chunk in diff_data.chunks:
> +            for row in chunk.lines:
> +                if row[line_num_index] == line_num:
> +                    return row[0]

There's a way to get this from Mercurial's internals. But we'd have to write an extension or something to dump the data. If we did this, we /might/ be able to avoid the API calls to Review Board. Not 100% sure about that.
Attachment #8650609 - Flags: review?(gps)
Attachment #8650610 - Flags: review?(gps) → review+
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16649/#review23009

::: pylib/mozreviewbots/snarkbot/__main__.py:12
(Diff revision 5)
> +        self.snark = snark
> +        super(SnarkBot, self).__init__(*args, **kwargs)

Typically super() is called first so you can overwrite anything it does.
Attachment #8663102 - Flags: review?(gps) → review+
Comment on attachment 8663102 [details]
MozReview Request: mozreviewbots: add python static analysis bot (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/19719/#review23011

::: pylib/mozreviewbots/pylintbot/__main__.py:1
(Diff revision 3)
> +from flake8.engine import get_style_guide

Need license header.

::: pylib/mozreviewbots/pylintbot/__main__.py:8
(Diff revision 3)
> +import argparse
> +import logging
> +import os
> +import random
> +import sys

I like putting stdlib imports first.

::: pylib/mozreviewbots/setup.py:18
(Diff revision 3)
> -    install_requires=['RBTools', 'kombu'],
> +    install_requires=['RBTools', 'flake8', 'kombu'],

Add pep8.

::: test-requirements.txt:25
(Diff revision 3)
> +flake8==2.4.1

flake8 will also install mccabe, pep8, and pyflakes. Please also add these.
https://reviewboard.mozilla.org/r/16647/#review23007

> See also the "which" 3rd party package. If this is the only use, meh.

Assuming this is the which you meant [1] (searching for 'which package python' is... difficult) it looks like it isn't being built for Python 2.7. This is the only use, so dropping.

[1] https://pypi.python.org/pypi/which/1.1.0
https://reviewboard.mozilla.org/r/16647/#review23007

> The "share" extension now has support for this. Read `hg help -e share` for details about "pooled storage." We should probably deploy that.

I don't think this will help with repositories we currently have on MozReview. I'll update the TODO to mention the share extension, but I'd prefer to leave this as a TODO for now.
mozreview: publish landing repository to pulse (bug 1196263) r=gps

Publishing the landing ("inbound") repository as well as the review repository
gives consumers such as static analysis bots the option of avoiding cloning
the potentially large review repository.
Attachment #8690849 - Flags: review?(gps)
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/16647/diff/5-6/
Attachment #8650609 - Flags: review?(gps)
Comment on attachment 8650610 [details]
MozReview Request: mozreviewbots: add SnarkBot (bug 1196263) r=gps

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/16649/diff/5-6/
Comment on attachment 8663102 [details]
MozReview Request: mozreviewbots: add python static analysis bot (bug 1196263) r=gps

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/19719/diff/3-4/
Comment on attachment 8690849 [details]
MozReview Request: mozreview: publish landing repository to pulse (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/25977/#review23441

::: hgext/reviewboard/tests/test-pulse-publishing.t:63
(Diff revision 1)
> +    landing_repository_url: ''

It would be really nice if this actually tested with a non-empty value.

::: pylib/mozreview/mozreview/pulse/__init__.py:60
(Diff revision 1)
> +    landing_repo_url = repo.extra_data.get('landing_repository_url', '')

Why not make this None so it appears as null in JSON? That seems better than an empty string.
Attachment #8690849 - Flags: review?(gps) → review+
Attachment #8650609 - Flags: review?(gps) → review+
Comment on attachment 8650609 [details]
MozReview Request: mozreviewbots: add a library for writing static analysis bots (bug 1196263) r=gps

https://reviewboard.mozilla.org/r/16647/#review23447

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:73
(Diff revision 6)
> +                exit(1)

I don't think exit() is defined. sys.exit() anywhere outside of the main function for a program is also IMO a bug because you shouldn't be exiting the process from a library function: you never know how your library will get used and patterns like these inhibit flexibility.

::: pylib/mozreviewbots/mozreviewbotlib/__init__.py:204
(Diff revision 6)
> +            diff_args.extend(['-U', unicode(diff_context)])

This mixes a str and unicode in the same list because we're not using unicode_literals. Assuming the value is a numeric, should probably be using `str(diff_context)`.
https://reviewboard.mozilla.org/r/25977/#review23441

> It would be really nice if this actually tested with a non-empty value.

I agree completely, but that would require configuring a custom hosting service through mach which isn't possible at the moment. I filed Bug 1227504 as a follow up, since I'm not sure how straightforward this will be.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
No longer blocks: 1229106
Product: Developer Services → MozReview
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: