Closed Bug 658493 Opened 13 years ago Closed 13 years ago

[pushes] command to update changesets in db from repo

Categories

(Webtools Graveyard :: Elmo, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: Pike)

Details

(Keywords: dogfood, Whiteboard: [hotfix])

Attachments

(2 files, 1 obsolete file)

I'm running into problems with the new -beta repos, those were just copied over and without a push, there's no mapping of changeset to repo.

I want to make two commands, one for listing which repos are affected (and quick running), and one for fixing a subset of them.
These commands do the right thing for me, tested against the l10n_site_test database.

Not sure who's good to review this, trying a mix of Peter and gandalf.
Attachment #533940 - Flags: review?(gandalf)
Attachment #533940 - Flags: review?(peterbe)
Comment on attachment 533940 [details] [diff] [review]
two commands, factored some of their code into __init__.py

Review of attachment 533940 [details] [diff] [review]:
-----------------------------------------------------------------

As often, I can't actually review the core of the functionality so I'm nit picking on pythonish semantics which are less important. So you get my r+

::: apps/pushes/management/commands/__init__.py
@@ +37,5 @@
> +
> +from django.conf import settings
> +import os.path
> +
> +

I would much prefer to see this in a "utils.py" file somewhere. __init__.py should ideally be avoided unless it's to make certain imports more convenient. 

Perhaps combine with a shared base class for the 'Command' used in both check- and fixrepos.py

@@ +48,5 @@
> +        for _name in names[1:]:
> +            q |= Q(name__startswith=_name)
> +        repos = repos.filter(q)
> +    else:
> +        repos = repos.all()

you can simplify this by starting with `.all()` and then adding the filter. Like this:

    repos = Repository.objects.all()
    if names:
        q = Q(name__startswith=names[0])
        for _name in names[1:]:
            q |= Q(name__startswith=_name)
        repos = repos.filter(q)

@@ +53,5 @@
> +    return repos
> +
> +
> +def resolve(path):
> +    return os.path.join(settings.REPOSITORY_BASE, *path.split('/'))

is splinter just being weird or is that missing a newline at the end?

::: apps/pushes/management/commands/checkrepos.py
@@ +44,5 @@
> +
> +class Command(BaseCommand):
> +    option_list = BaseCommand.option_list + (
> +        make_option('-a', '--all', action = 'store_true',
> +                    help = 'Refresh all repositories'),

make_option takes a `default=False`

@@ +50,5 @@
> +    help = 'Find repositories with missing changeset entries in the db.'
> +
> +    def handle(self, *args, **options):
> +        all = options.get('all', False)
> +        from mercurial.ui import ui as _ui

If there's an explicit reason to import this inside the method perhaps it should have a comment about that reasoning. 
Otherwise I'd vote for having imports at the top.

@@ +58,5 @@
> +
> +        from . import repos_for_names, resolve
> +
> +        repos = repos_for_names(*args)
> +        ui = _ui()

Just semantics but instead of the `as _ui` stuff you could just do it as a namespace like this:

    import mercurial.ui
    ui = mercurial.ui.ui()

No need for underscores

@@ +61,5 @@
> +        repos = repos_for_names(*args)
> +        ui = _ui()
> +        needs_nl = False
> +        for dbrepo in repos:
> +            repopath = str(resolve(dbrepo.name))

unicode() instead of str() please. 

Also, why not be more explicit and use:

    repopath = resolve(dbrepo.name).name

If resolve() is breaking for some reason and returns `None` you'd otherwise get `'None'` or `u'None'`.

::: apps/pushes/management/commands/fixrepos.py
@@ +39,5 @@
> +'''
> +
> +from optparse import make_option
> +
> +from django.core.management.base import BaseCommand, CommandError

CommandError is imported but never used.

@@ +45,5 @@
> +class Command(BaseCommand):
> +    option_list = BaseCommand.option_list + (
> +        make_option('-a', '--all', action = 'store_true',
> +                    help = 'Refresh all repositories'),
> +        )

This code is repeated in checkrepos.py

Instead, I'd prefer a subclass. Move the `Command` class to a file called base.py and create it as:

   class RepoCommand(BaseCommand):
       option_list = BaseCommand.option_list + ...
       def handle(self, *args, **options):
           raise NotImplementedError
Attachment #533940 - Flags: review?(peterbe) → review+
Peter, I've realized that I need to fix ten thousands of entries, per repo, so I've done some more-or-less funky SQL stuff. I also majorly refactored, as you suggested.

But on the SQL stuff, I'd appreciate your review.
Attachment #533940 - Attachment is obsolete: true
Attachment #533940 - Flags: review?(gandalf)
Attachment #534303 - Flags: review?(peterbe)
Comment on attachment 534303 [details] [diff] [review]
addressing comments, using sql instead, too, this is perf

Review of attachment 534303 [details] [diff] [review]:
-----------------------------------------------------------------

I'm struggling to review the InsertQuery bit because I can only review it by reading the code. I don't really understand how the command works.

There are numerous little PEP8 niggles such as missing whitespace. Consider installing check.py from https://github.com/jbalogh/check

::: apps/pushes/management/commands/fixrepos.py
@@ +111,5 @@
> +                break
> +
> +    def nodes(self, hgrepo):
> +        for i in xrange(len(hgrepo)):
> +            yield hgrepo[i].hex()

Not sure if it helps but to avoid having to execute the len() which breaks the generator pattern you can use the __iter__ available on the hgrepo like this instead:

    for i, __ in enumerate(hgrepo):
        yield hgrepo[i].hex()
Attachment #534303 - Flags: review?(peterbe) → review+
There are two bugs in the fixrepos command that I landed that I didn't find during testing, mostly because my local connection to the db is so slow that I never completed a fixup, I only ever improved. Also added defense-in-depth by putting the subclassed command into a try-except block.

With success, it turns out that I print "updated 0 changesets" because I don't increment the counter. Also, walking over repos that don't need action fails.
Attachment #534471 - Flags: review?(peterbe)
Comment on attachment 534471 [details] [diff] [review]
bustage fix, we gotta get better at this.

Review of attachment 534471 [details] [diff] [review]:
-----------------------------------------------------------------

The catch all Exception isn't great but I understand the importance of getting this landed. Also, writing tests for management commands is hard.

::: apps/pushes/management/base.py
@@ +65,5 @@
>                  continue
>              hgrepo = repository(ui, repopath)
> +            try:
> +                self.handleRepo(dbrepo, hgrepo)
> +            except Exception, e:

doing 'except Exception' is the same as 'except:' which is not a great idea. 

I'd rather we specify the exception we expect otherwise you might catch a NameError or OSError which really want to know about as a soon as possible. 

Also, we should log all errors that are unexpected like this:

    try:
        stuff
    except SomeException:
        logging.error("Unable to do the thing", exc_info=True)
Attachment #534471 - Flags: review?(peterbe) → review+
https://github.com/mozilla/elmo/commit/6d428e93ca1d2d6cd8e9418a8ae8b4d9e461db19, FIXED. I went a bit back and forth with Peter over irc on the error handling, we met somewhere in the middle.

This is also a bustage fix hotfix, and thus tagged as 2011-05-23.1
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → 1.2
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: