Closed Bug 1361037 Opened 7 years ago Closed 7 years ago

prepare compare-locales to support generic configuration

Categories

(Localization Infrastructure and Tools :: compare-locales, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: Pike)

References

Details

Attachments

(15 files)

59 bytes, text/x-review-board-request
flod
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
stas
: review+
Details
59 bytes, text/x-review-board-request
stas
: review+
flod
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
stas
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
stas
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
stas
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
stas
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
stas
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
stas
: review+
Details
59 bytes, text/x-review-board-request
flod
: review+
stas
: review+
Details
For cross-channel, we need a more flexible configuration for compare-locales.

Track this.

Using this for Firefox will require such configs to land in Firefox, which is bug 1165906.
The current patch queue is already deep, and contains mostly breaking changes in order to generalize how files are found in compare-locales.

I also started to make the reference files not required by the code paths, so that we can add bilinguagal file formats easier.
Assignee: nobody → l10n
Comment on attachment 8868566 [details]
bug 1361037, breaking backwards compat calls for major version bump,

https://reviewboard.mozilla.org/r/140174/#review143486

::: compare_locales/__init__.py:1
(Diff revision 1)
> -version = "1.2.3"
> +version = "2.0-beta.1"

I agree on bumping the major version number, but 2.0-beta.1 is a peculiar version number. Any technical reason to not go for 2.0b1? I think it would be a lot more readable.
Attachment #8868566 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868567 [details]
bug 1361037, drop old fennec support hacks,

https://reviewboard.mozilla.org/r/135880/#review143488

::: commit-message-6640e:3
(Diff revision 1)
> +bug 1361037, drop old fennec support hacks, r?flod, stas
> +
> +Way back when Fennec was it's own app in a mobile-browser, I

it's -> its
Attachment #8868567 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868568 [details]
bug 1361037, remove support for webapps,

https://reviewboard.mozilla.org/r/135882/#review143502
Attachment #8868568 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868569 [details]
bug 1361037, unsupport compare-dirs,

https://reviewboard.mozilla.org/r/135884/#review143504
Attachment #8868569 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868570 [details]
bug 1361037, don't cache en-US reference,

https://reviewboard.mozilla.org/r/135886/#review143506

It sounds a bit surprising that caching doesn't help that much when running against multiple locales.
Attachment #8868570 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868571 [details]
bug 1361037, flyby flake8 fixes,

https://reviewboard.mozilla.org/r/135888/#review143510
Attachment #8868571 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868572 [details]
bug 1361037, part 0: convert from os.path to mozpath for consistent seperators in paths, use plain IO,

https://reviewboard.mozilla.org/r/136224/#review143582

Can't spot anything wrong with this one, nice how much using mozpath cleans up the code.
Attachment #8868572 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868566 [details]
bug 1361037, breaking backwards compat calls for major version bump,

https://reviewboard.mozilla.org/r/140174/#review143622

::: compare_locales/__init__.py:1
(Diff revision 1)
> -version = "1.2.3"
> +version = "2.0-beta.1"

I've went for http://semver.org/#spec-item-9, at least I tried.

I'm not 100% sure that I'd want semver going forward, but that's the inspiration.
Blocks: 1365746
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review143972

Feels like the Python level skyrocketed beyond my skills with this patch :-\

Still going through the code with a few notes/questions, sadly a long way to go before finishing.

::: compare_locales/paths.py:20
(Diff revision 1)
> +    Supports path matching similar to mozpath.match(), but does
> +    not match trailing file paths without trailing wildcards.
> +    Also gets a prefix, which is the path before the first wildcard,
> +    which is good for filesystem iterations, and allows to replace
> +    the own matches in a path on a different Matcher. compare-locales
> +    uses that to transform l10n and en-US paths back and forth.

I feel like this needs a bit of rewording, assuming I understand what's going on (self.prefix and .sub don't seem related?). For example:

    Supports path matching similar to mozpath.match(), but does not match
    trailing file paths without explicit trailing wildcards. 
    Stores the path before the first wildcard as "prefix" to simplify filesystem
    iterations. 
    Allows to replace the matches in a path on a different Matcher object.
    compare-locales uses that to transform l10n and en-US paths back and forth.

::: compare_locales/paths.py:58
(Diff revision 1)
> +        Replace the wildcard matches in this pattern into the
> +        pattern of the other Match object.
> +        '''
> +        if not self.match(path):
> +            return None
> +        return self.regex.sub(other.placable, path)

Is this wanted behavior?

one = Matcher('foo/**')
other = Matcher('bar/*')
print one.sub(other, 'foo/baz')

::: compare_locales/paths.py:83
(Diff revision 1)
> +        TODO: We may support an additional locale key in the future.
> +        '''
> +        for d in paths:
> +            rv = {
> +                'l10n': d['l10n'],
> +                'module': d.get('module')

I'm lost. We don't have a definition of modules in the current config files?

::: compare_locales/paths.py:127
(Diff revision 1)
> +        '''Filter a localization file or entities within, according to
> +        this configuration file.'''
> +        if self.filter_py is not None:
> +            return self.filter_py(l10n_file.module, l10n_file.file,
> +                                  entity=entity)
> +        for rule in reversed(self.rules):

Any reason to not build self.rules already reversed (and explain why) in add_rules?

::: compare_locales/tests/test_paths.py:22
(Diff revision 1)
> +        self.assertTrue(one.match('foo/baz'))
> +        self.assertFalse(one.match('foo/baz/qux'))
> +        other = Matcher('bar/*')
> +        self.assertTrue(other.match('bar/baz'))
> +        self.assertFalse(other.match('bar/baz/qux'))
> +        self.assertEqual(one.sub(other, 'foo/baz'), 'bar/baz')

Should add tests for non matching paths? e.g. one.sub(other, 'bar/baz')
Comment on attachment 8868575 [details]
bug 1361037, part 3: remove ini-specific code that is now unused,

https://reviewboard.mozilla.org/r/138050/#review144084
Attachment #8868575 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868576 [details]
bug 1361037, part 4: move android check detection from checks to project config,

https://reviewboard.mozilla.org/r/139244/#review144090
Attachment #8868576 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review144088

::: compare_locales/paths.py:58
(Diff revision 1)
> +        Replace the wildcard matches in this pattern into the
> +        pattern of the other Match object.
> +        '''
> +        if not self.match(path):
> +            return None
> +        return self.regex.sub(other.placable, path)

Mixing up wildcards doesn't work great, to not at all.

That's partly because how the regexes from mozpath work, as they're including the / in the match for /* and /**.

Not sure if it's worth fixing.

::: compare_locales/paths.py:83
(Diff revision 1)
> +        TODO: We may support an additional locale key in the future.
> +        '''
> +        for d in paths:
> +            rv = {
> +                'l10n': d['l10n'],
> +                'module': d.get('module')

We do, but I don't want to enforce them in the API to the new configs.

::: compare_locales/paths.py:127
(Diff revision 1)
> +        '''Filter a localization file or entities within, according to
> +        this configuration file.'''
> +        if self.filter_py is not None:
> +            return self.filter_py(l10n_file.module, l10n_file.file,
> +                                  entity=entity)
> +        for rule in reversed(self.rules):

I did it this way because then the documents for the evaluation match what the code does. As in, we're reversing at evaluation time, and not the opposite way around.
Comment on attachment 8868577 [details]
bug 1361037, part 5: support locale subsets,

https://reviewboard.mozilla.org/r/140176/#review144094

::: compare_locales/tests/test_paths.py:359
(Diff revision 1)
> +        self.assertListEqual(
> +            list(files),
> +            [
> +                ('/tmp/de/major/good.ftl', None, set()),
> +                ('/tmp/de/minor/good.ftl', None, set()),
> +                ])

Is the indentation right here? Same in the following test
Attachment #8868577 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868579 [details]
bug 1361037, optionally track file status in standard Observer,

https://reviewboard.mozilla.org/r/140180/#review144110

::: commit-message-491be:6
(Diff revision 1)
> +bug 1361037, optionally track file status in standard Observer, r?flod, stas
> +
> +This functionality previously lives in
> +https://github.com/Pike/master-ball/blob/b19f6163ff96e445cc37bade88c55338c52ee1da/vendor-local/l10ninsp/slave.py#L23-L34.
> +But that requires that I only run one project at a time, as we otherwise
> +loose which files are in which project.

nits: lived, lose

::: compare_locales/compare.py:166
(Diff revision 1)
> +            self.file_stats = defaultdict(lambda: defaultdict(dict))
>  
>      # support pickling
>      def __getstate__(self):
> -        return dict(summary=self.getSummary(), details=self.details)
> +        state = dict(summary=self._dictify(self.summary), details=self.details)
> +        if self.file_stats is not None:

Should this be a oneliner?

self.file_stats = defaultdict(lambda: defaultdict(dict)) if file_stats else None

::: compare_locales/compare.py:209
(Diff revision 1)
>              if (self.filter is not None and
>                      self.filter(file) in (None, 'ignore')):
>                  return 'ignore'
>              self.summary[file.locale][category] += data
> +            if self.file_stats is not None:
> +                # missingInFiles is missing, for stat purposes

Uhm, can you clarify this comment?
Attachment #8868579 - Flags: review?(francesco.lodolo) → review+
(In reply to Francesco Lodolo [:flod] from comment #31)
> > +                # missingInFiles is missing, for stat purposes
> 
> Uhm, can you clarify this comment?

From looking at other patches, maybe "#missingInFiles is counted as 'missing' for stats"?
Comment on attachment 8868579 [details]
bug 1361037, optionally track file status in standard Observer,

https://reviewboard.mozilla.org/r/140180/#review144110

> Should this be a oneliner?
> 
> self.file_stats = defaultdict(lambda: defaultdict(dict)) if file_stats else None

I'm not a big fan of one liners, unless they're really trivial. Understanding the type is hard enough, so I'd just keep this separate.

> Uhm, can you clarify this comment?

As you hinted in the bug, a missing file (for historic reasons) is counted independently from missing strings in existing files. Back when we did this, people worked in VCS, and if you didn't start on something yet, you didn't have a file. With pootle committing happily files with just comments, that's not as useful as it used to be.

For the per-file statistics, I dropped the distinction between "you don't have the file at all" and "there's stuff missing in this file".

I should up the comment.
Comment on attachment 8868580 [details]
bug 1361037, Cleanup stats handling,

https://reviewboard.mozilla.org/r/140182/#review144128
Attachment #8868580 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868579 [details]
bug 1361037, optionally track file status in standard Observer,

https://reviewboard.mozilla.org/r/140180/#review144144

::: commit-message-491be:6
(Diff revision 1)
> +bug 1361037, optionally track file status in standard Observer, r?flod, stas
> +
> +This functionality previously lives in
> +https://github.com/Pike/master-ball/blob/b19f6163ff96e445cc37bade88c55338c52ee1da/vendor-local/l10ninsp/slave.py#L23-L34.
> +But that requires that I only run one project at a time, as we otherwise
> +loose which files are in which project.

Fixed.

::: compare_locales/compare.py:209
(Diff revision 1)
>              if (self.filter is not None and
>                      self.filter(file) in (None, 'ignore')):
>                  return 'ignore'
>              self.summary[file.locale][category] += data
> +            if self.file_stats is not None:
> +                # missingInFiles is missing, for stat purposes

The comment is now

# missingInFiles should just be "missing" in file stats
Comment on attachment 8868578 [details]
bug 1361037, part 6: support multiple projects in one compare-locales run,

https://reviewboard.mozilla.org/r/140178/#review144132

The args splitting seems brittle, but I can't think of anything better and as readable. For example, if the config file doesn't exist, you would end up with an empty args.config

I though about using something like this

while all_args and not os.path.isdir(all_args[0]):
    args.config.append(all_args.pop(0))
if not all_args:
    # all_args is empty, there's no valid path in the provided arguments
    raise ArgumentTypeError('No valid path provided in the arguments')
args.l10n_base_dir = all_args.pop(0)
args.locales.extend(all_args)
Attachment #8868578 - Flags: review?(francesco.lodolo) → review+
Attachment #8868566 - Flags: review?(stas)
Attachment #8868567 - Flags: review?(stas)
Attachment #8868568 - Flags: review?(stas)
Attachment #8868569 - Flags: review?(stas)
Attachment #8868570 - Flags: review?(stas)
Attachment #8868571 - Flags: review?(stas)
Attachment #8868573 - Flags: review?(francesco.lodolo)
Attachment #8868573 - Flags: review?(francesco.lodolo)
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review144586

::: compare_locales/paths.py:129
(Diff revision 1)
> +        if self.filter_py is not None:
> +            return self.filter_py(l10n_file.module, l10n_file.file,
> +                                  entity=entity)
> +        for rule in reversed(self.rules):
> +            matcher = Matcher(
> +                rule.get('path', '.'),

Should we update the docs allowing for empty paths in rules?

::: compare_locales/paths.py:133
(Diff revision 1)
> +            matcher = Matcher(
> +                rule.get('path', '.'),
> +                l10n_file.locale)
> +            if not matcher.match(l10n_file.fullpath):
> +                continue
> +            if ('key' in rule) ^ (entity is not None):

I'm not following the logic here. If there is a rule that applies to an entire file, why would a call to filter with an entity not return that action?

In other words, I would expect self.cfg.filter(self.file, self.entity) to return 'ignore' in test_single_file_rule
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review144600

::: compare_locales/paths.py:20
(Diff revision 1)
> +    Supports path matching similar to mozpath.match(), but does
> +    not match trailing file paths without trailing wildcards.
> +    Also gets a prefix, which is the path before the first wildcard,
> +    which is good for filesystem iterations, and allows to replace
> +    the own matches in a path on a different Matcher. compare-locales
> +    uses that to transform l10n and en-US paths back and forth.

Yeah.

::: compare_locales/paths.py:129
(Diff revision 1)
> +        if self.filter_py is not None:
> +            return self.filter_py(l10n_file.module, l10n_file.file,
> +                                  entity=entity)
> +        for rule in reversed(self.rules):
> +            matcher = Matcher(
> +                rule.get('path', '.'),

Actually, I think this needs a fix, I don't allow rules to have no path in _compileRule.

::: compare_locales/paths.py:133
(Diff revision 1)
> +            matcher = Matcher(
> +                rule.get('path', '.'),
> +                l10n_file.locale)
> +            if not matcher.match(l10n_file.fullpath):
> +                continue
> +            if ('key' in rule) ^ (entity is not None):

The intent is that you can create a rule that a file is missing, but if the file's there, you need the keys.

::: compare_locales/tests/test_paths.py:22
(Diff revision 1)
> +        self.assertTrue(one.match('foo/baz'))
> +        self.assertFalse(one.match('foo/baz/qux'))
> +        other = Matcher('bar/*')
> +        self.assertTrue(other.match('bar/baz'))
> +        self.assertFalse(other.match('bar/baz/qux'))
> +        self.assertEqual(one.sub(other, 'foo/baz'), 'bar/baz')

Yeah.
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review144604

::: compare_locales/paths.py:133
(Diff revision 1)
> +            matcher = Matcher(
> +                rule.get('path', '.'),
> +                l10n_file.locale)
> +            if not matcher.match(l10n_file.fullpath):
> +                continue
> +            if ('key' in rule) ^ (entity is not None):

Right, it's even explained in the docs.

This entry is optional and, if missing, the filter will only apply to missing or additional files in the localization. If you want to ignore all entries in a file, specify a generic key, re:..
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review145058

This feels more like an f+, I trust stas' review more than mine with this amount and complexity of code.

::: compare_locales/paths.py:205
(Diff revision 2)
> +                    if (mozpath.realpath(m['reference'].prefix) !=
> +                            mozpath.realpath(m_.get('reference').prefix)):
> +                        raise RuntimeError('Mismatch in reference for ' +
> +                                           mozpath.realpath(m['l10n'].prefix))
> +                drops.add(i_ + i + 1)
> +                m['test'] |= m_['test']

Took me a while to figure this out, maybe worth a comment.

Personally I find .update more readable

m['test'].update(m_['test'])

If two rules are identical for l10n and reference the second is dropped, but we keep tests. Is that a scenario we want/need to support (same files, different tests)?
Attachment #8868573 - Flags: review?(francesco.lodolo) → review+
Comment on attachment 8868574 [details]
bug 1361037, part 2: compare-locales uses project config under the hood,

https://reviewboard.mozilla.org/r/138048/#review145072
Attachment #8868574 - Flags: review?(francesco.lodolo) → review+
(In reply to Axel Hecht [:Pike] from comment #29)
> ::: compare_locales/paths.py:83
> (Diff revision 1)
> > +        TODO: We may support an additional locale key in the future.
> > +        '''
> > +        for d in paths:
> > +            rv = {
> > +                'l10n': d['l10n'],
> > +                'module': d.get('module')
> 
> We do, but I don't want to enforce them in the API to the new configs.

Can you clarify this point? It's still unclear to me. Are you referring to the modules in filter.py or something else completely?
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review145058

> Took me a while to figure this out, maybe worth a comment.
> 
> Personally I find .update more readable
> 
> m['test'].update(m_['test'])
> 
> If two rules are identical for l10n and reference the second is dropped, but we keep tests. Is that a scenario we want/need to support (same files, different tests)?

I basically had two options here, assert if the tests aren't equal, or just add them up.

For tests, I figured we'd be less strict, but I'd like to avoid that in practice.
(In reply to Francesco Lodolo [:flod] from comment #57)
> (In reply to Axel Hecht [:Pike] from comment #29)
> > ::: compare_locales/paths.py:83
> > (Diff revision 1)
> > > +        TODO: We may support an additional locale key in the future.
> > > +        '''
> > > +        for d in paths:
> > > +            rv = {
> > > +                'l10n': d['l10n'],
> > > +                'module': d.get('module')
> > 
> > We do, but I don't want to enforce them in the API to the new configs.
> 
> Can you clarify this point? It's still unclear to me. Are you referring to
> the modules in filter.py or something else completely?

The modules stuff is only needed as long as we have legacy filter.py files, as they're part of the python API of that code path.

For the TOML-side of life, the filters work on the complete local path, and don't care about the difference between 'dom' and 'chrome/plugins.properties' in dom/chrome/plugins.properties.
Comment on attachment 8868572 [details]
bug 1361037, part 0: convert from os.path to mozpath for consistent seperators in paths, use plain IO,

https://reviewboard.mozilla.org/r/136224/#review145152

::: compare_locales/mozpath.py:115
(Diff revision 2)
> +        return True
> +    if pattern not in re_cache:
> +        p = re.escape(pattern)
> +        p = re.sub(r'(^|\\\/)\\\*\\\*\\\/', r'\1(?:.+/)?', p)
> +        p = re.sub(r'(^|\\\/)\\\*\\\*$', r'(?:\1.+)?', p)
> +        p = p.replace(r'\*', '[^/]*') + '(?:/.*)?$'

Reading and debugging regexes is hard. Perhaps provide a short comment on what each of these re.subs does?
Attachment #8868572 - Flags: review?(stas) → review+
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review145154

::: compare_locales/paths.py:16
(Diff revision 2)
>  
>  
> +class Matcher(object):
> +    '''Path pattern matcher
> +    Supports path matching similar to mozpath.match(), but does
> +    not match trailing file paths without trailing wildcards.

Why is this important for the use-case of compare-locales?

::: compare_locales/paths.py:36
(Diff revision 2)
> +        p = re.sub(r'(^|\\\/)\\\*\\\*\\\/', r'\1(.+/)?', p)
> +        p = re.sub(r'(^|\\\/)\\\*\\\*$', r'(\1.+)?', p)
> +        p = p.replace(r'\*', '([^/]*)') + '$'
> +        r = re.escape(pattern)
> +        r = re.sub(r'(^|\\\/)\\\*\\\*\\\/', r'\\\\0', r)
> +        r = re.sub(r'(^|\\\/)\\\*\\\*$', r'\\\\0', r)

Again, perhaps document the regexes a bit?

::: compare_locales/paths.py:127
(Diff revision 2)
> +        '''Filter a localization file or entities within, according to
> +        this configuration file.'''
> +        if self.filter_py is not None:
> +            return self.filter_py(l10n_file.module, l10n_file.file,
> +                                  entity=entity)
> +        for rule in reversed(self.rules):

Why is it important that self.rules be reversed here?

::: compare_locales/paths.py:190
(Diff revision 2)
> +                    m['reference'] = paths['reference']
> +                m['test'] = set(paths.get('test', []))
> +                self.matchers.append(m)
> +        self.matchers.reverse()  # we always iterate last first
> +        drops = set()  # duplicate matchers to remove
> +        for i, m in enumerate(self.matchers[:-1]):

It took me a few minutes to understand the logic here. How about adding a comment?

"Compare each matcher against all other matchers.  Avoid n^2 comparisons by only scanning the upper triangle of a n × n matrix of all possible combinations."

::: compare_locales/tests/test_paths.py:224
(Diff revision 2)
> +                '/tmp/somedir/{locale}/toolkit/two/one/file.ftl',
> +            ],
> +            'key': [
> +                'one_entity',
> +                'other_entity',
> +            ],

What is the use-case for specifying more than one path here?  Would forbidding this help make the rules code easier to maintain?  (And also make the code in _compile_rule simpler).
Attachment #8868573 - Flags: review?(stas) → review+
Comment on attachment 8868574 [details]
bug 1361037, part 2: compare-locales uses project config under the hood,

https://reviewboard.mozilla.org/r/138048/#review146432
Attachment #8868574 - Flags: review?(stas) → review+
Comment on attachment 8868575 [details]
bug 1361037, part 3: remove ini-specific code that is now unused,

https://reviewboard.mozilla.org/r/138050/#review146434
Attachment #8868575 - Flags: review?(stas) → review+
Comment on attachment 8868576 [details]
bug 1361037, part 4: move android check detection from checks to project config,

https://reviewboard.mozilla.org/r/139244/#review146436

::: compare_locales/checks.py:200
(Diff revision 2)
>      xmllist = set(('amp', 'lt', 'gt', 'apos', 'quot'))
>  
> -    def __init__(self, reference):
> +    def __init__(self, extra_tests, reference):
> +        super(DTDChecker, self).__init__(extra_tests)
> +        if self.extra_tests is not None and 'android-dtd' in self.extra_tests:
> +            self.processContent = True

It looks like processContent used to be a method which subclasses could override.  With the addition of processAndroidContent, is processContent just a bool flag now?  Should it be renamed to self.needs_processing or something similar?

::: compare_locales/checks.py:356
(Diff revision 2)
>                  for s in refMap.iterkeys():
>                      msgs.insert(0, '%s only in reference' % s)
>                  if msgs:
>                      yield ('warning', 0, ', '.join(msgs), 'css')
>  
> -        if self.processContent is not None:
> +        if self.extra_tests is not None and 'android-dtd' in self.extra_tests:

Is it necessary to repeat the condition from __init__ here?  Wouldn't checking for self.processContent be enough?
Attachment #8868576 - Flags: review?(stas) → review+
Comment on attachment 8868577 [details]
bug 1361037, part 5: support locale subsets,

https://reviewboard.mozilla.org/r/140176/#review146444

::: compare_locales/tests/test_paths.py:359
(Diff revision 2)
> +        self.assertListEqual(
> +            list(files),
> +            [
> +                ('/tmp/de/major/good.ftl', None, set()),
> +                ('/tmp/de/minor/good.ftl', None, set()),
> +                ])

Nit: something's wrong with the indentation here.
Attachment #8868577 - Flags: review?(stas) → review+
Comment on attachment 8868578 [details]
bug 1361037, part 6: support multiple projects in one compare-locales run,

https://reviewboard.mozilla.org/r/140178/#review146452

::: compare_locales/commands.py:88
(Diff revision 2)
>  not locales given, the list of locales will be taken from the all-locales file
>  of the application\'s l10n.ini."""
>  
>      def get_parser(self):
>          parser = super(CompareLocales, self).get_parser()
> -        parser.add_argument('ini_file', metavar='l10n.ini',
> +        parser.add_argument('config', metavar='l10n.ini', nargs='+',

As a heavy CLI user I find nargs="*" confusing when used with named arguments.

Rather than something like this:

    $ compare-locales --config a.ini b.ini ...

I'd much prefer this:

    $ compare-locales --config a.ini --config b.ini ...

I think this can be achieved in argsparse with action='append'.

::: compare_locales/commands.py:109
(Diff revision 2)
>          self.add_data_argument(parser)
>          return parser
>  
>      def handle(self, args):
> -        app = EnumerateApp(args.ini_file, args.l10n_base_dir, args.locales)
> -        project_config = app.asConfig()
> +        # using nargs multiple times in argparser totally screws things
> +        # up, repair that.

Sounds like going for action='append' would also help you avoid this additional code here.
Attachment #8868578 - Flags: review?(stas) → review-
Comment on attachment 8868572 [details]
bug 1361037, part 0: convert from os.path to mozpath for consistent seperators in paths, use plain IO,

https://reviewboard.mozilla.org/r/136224/#review146472

::: compare_locales/mozpath.py:115
(Diff revision 2)
> +        return True
> +    if pattern not in re_cache:
> +        p = re.escape(pattern)
> +        p = re.sub(r'(^|\\\/)\\\*\\\*\\\/', r'\1(?:.+/)?', p)
> +        p = re.sub(r'(^|\\\/)\\\*\\\*$', r'(?:\1.+)?', p)
> +        p = p.replace(r'\*', '[^/]*') + '(?:/.*)?$'

I've copied and pasted them from mozpath, and I don't know why they look the way they do. I've spent an hour then to find out when they started to look like this, and they just did.

We'd need an upstream bug to figure this out.
Comment on attachment 8868573 [details]
bug 1361037, part 1: ProjectConfig abstraction, new and old filters, path matching and file iteration,

https://reviewboard.mozilla.org/r/138046/#review146474

::: compare_locales/paths.py:16
(Diff revision 2)
>  
>  
> +class Matcher(object):
> +    '''Path pattern matcher
> +    Supports path matching similar to mozpath.match(), but does
> +    not match trailing file paths without trailing wildcards.

Because it's really hard to tell what should match and what shouldn't. I figured this out while writing tests for this stuff, I consider this to be a bug in mozpack.path, tbh, but I don't care enough to drive a fix throughout the tree.

::: compare_locales/paths.py:127
(Diff revision 2)
> +        '''Filter a localization file or entities within, according to
> +        this configuration file.'''
> +        if self.filter_py is not None:
> +            return self.filter_py(l10n_file.module, l10n_file.file,
> +                                  entity=entity)
> +        for rule in reversed(self.rules):

This is documented in http://moz-l10n-config.readthedocs.io/en/latest/fileformat.html#evaluation.

The reason why flod and I did this is twofold: For one, you have an explicit behavior, is it's important that they're sorted.

The reason for them being reversed is that you can write the general rules first, and then specify the exceptions to that below. That is much more readable than having to start out with the details first, and after a while to get to the common case.

::: compare_locales/paths.py:190
(Diff revision 2)
> +                    m['reference'] = paths['reference']
> +                m['test'] = set(paths.get('test', []))
> +                self.matchers.append(m)
> +        self.matchers.reverse()  # we always iterate last first
> +        drops = set()  # duplicate matchers to remove
> +        for i, m in enumerate(self.matchers[:-1]):

Yeah.

::: compare_locales/tests/test_paths.py:224
(Diff revision 2)
> +                '/tmp/somedir/{locale}/toolkit/two/one/file.ftl',
> +            ],
> +            'key': [
> +                'one_entity',
> +                'other_entity',
> +            ],

Yes, this is to make the configuration files easier to maintain.
Comment on attachment 8868576 [details]
bug 1361037, part 4: move android check detection from checks to project config,

https://reviewboard.mozilla.org/r/139244/#review146492

::: compare_locales/checks.py:356
(Diff revision 2)
>                  for s in refMap.iterkeys():
>                      msgs.insert(0, '%s only in reference' % s)
>                  if msgs:
>                      yield ('warning', 0, ', '.join(msgs), 'css')
>  
> -        if self.processContent is not None:
> +        if self.extra_tests is not None and 'android-dtd' in self.extra_tests:

I used the test semantics to make it forwards compatible.

We might add other extra tests in the future, some of which might depend on the text content, others might now.
Comment on attachment 8868577 [details]
bug 1361037, part 5: support locale subsets,

https://reviewboard.mozilla.org/r/140176/#review146494

::: compare_locales/tests/test_paths.py:359
(Diff revision 2)
> +        self.assertListEqual(
> +            list(files),
> +            [
> +                ('/tmp/de/major/good.ftl', None, set()),
> +                ('/tmp/de/minor/good.ftl', None, set()),
> +                ])

flod complained about this in another location, too.

flake8 is totally fine with it, though.
Comment on attachment 8868578 [details]
bug 1361037, part 6: support multiple projects in one compare-locales run,

https://reviewboard.mozilla.org/r/140178/#review146496

Rerequesting review, as stas r-'ed this based on the arguments being the arguments would be named arguments. They're all positional.

::: compare_locales/commands.py:88
(Diff revision 2)
>  not locales given, the list of locales will be taken from the all-locales file
>  of the application\'s l10n.ini."""
>  
>      def get_parser(self):
>          parser = super(CompareLocales, self).get_parser()
> -        parser.add_argument('ini_file', metavar='l10n.ini',
> +        parser.add_argument('config', metavar='l10n.ini', nargs='+',

Just that all of these are positional arguments.

> compare-locales browser/locales/l10n.ini mobile/android/locales/l10n.ini ../l10n-central de it pl

is how the command like looks.
Attachment #8868578 - Flags: review?(l10n)
Attachment #8868578 - Flags: review?(stas)
Attachment #8868578 - Flags: review?(l10n)
Attachment #8868578 - Flags: review-
Comment on attachment 8868578 [details]
bug 1361037, part 6: support multiple projects in one compare-locales run,

https://reviewboard.mozilla.org/r/140178/#review146502

::: compare_locales/commands.py:88
(Diff revision 2)
>  not locales given, the list of locales will be taken from the all-locales file
>  of the application\'s l10n.ini."""
>  
>      def get_parser(self):
>          parser = super(CompareLocales, self).get_parser()
> -        parser.add_argument('ini_file', metavar='l10n.ini',
> +        parser.add_argument('config', metavar='l10n.ini', nargs='+',

Oh, this is even more confusing\!  How open to changing this are you?  The only positional arguments here should be the locale names, since it's called compare-\*locales\*.
Comment on attachment 8868578 [details]
bug 1361037, part 6: support multiple projects in one compare-locales run,

https://reviewboard.mozilla.org/r/140178/#review146502

> Oh, this is even more confusing\!  How open to changing this are you?  The only positional arguments here should be the locale names, since it's called compare-\*locales\*.

I actually feel strongly that named arguments are optional arguments. It's also tedious to type, whereas files and directories nicely tab-complete.

I don't think that the command line should take the name of the command into account too much.
Comment on attachment 8868578 [details]
bug 1361037, part 6: support multiple projects in one compare-locales run,

https://reviewboard.mozilla.org/r/140178/#review146516
Attachment #8868578 - Flags: review?(stas) → review+
Comment on attachment 8868578 [details]
bug 1361037, part 6: support multiple projects in one compare-locales run,

https://reviewboard.mozilla.org/r/140178/#review146502

> I actually feel strongly that named arguments are optional arguments. It's also tedious to type, whereas files and directories nicely tab-complete.
> 
> I don't think that the command line should take the name of the command into account too much.

I guess it's an r+ on the rest of this patch then, with a note that I disagree on the CLI interface.  I don't think it makes sense to block on this.
Comment on attachment 8868579 [details]
bug 1361037, optionally track file status in standard Observer,

https://reviewboard.mozilla.org/r/140180/#review146534
Attachment #8868579 - Flags: review?(stas) → review+
Comment on attachment 8868580 [details]
bug 1361037, Cleanup stats handling,

https://reviewboard.mozilla.org/r/140182/#review146554

::: compare_locales/compare.py:436
(Diff revision 3)
> +    def updateStats(self, file, stats):
> +        """Check observer for the found data, and if it's
> +        not to ignore, notify other_observers.
> +        """
> +        for observer in self.observers + self.other_observers:
> +            observer.updateStats(file, stats)

Looking at this code I started to wonder where self.observers is being filled with items.  There's only add_observer in ContentComparer and it adds to self.other_observers.

I then remembered that in https://reviewboard.mozilla.org/r/140178/diff/2 there are a few places with:

    cc = ContentComparer()
    cc.observers.append(Observer())
    
For readability and consistency it might be a good idea to rename the current add_observer to add_other_observer and add a new method for appending to self.observers: add_observer (exact naming TBD).
Attachment #8868580 - Flags: review?(stas) → review+
Comment on attachment 8868576 [details]
bug 1361037, part 4: move android check detection from checks to project config,

https://reviewboard.mozilla.org/r/139244/#review146880

::: compare_locales/checks.py:200
(Diff revision 2)
>      xmllist = set(('amp', 'lt', 'gt', 'apos', 'quot'))
>  
> -    def __init__(self, reference):
> +    def __init__(self, extra_tests, reference):
> +        super(DTDChecker, self).__init__(extra_tests)
> +        if self.extra_tests is not None and 'android-dtd' in self.extra_tests:
> +            self.processContent = True

I kept the name, but made a boolean, and only set it in __init__.
Comment on attachment 8868577 [details]
bug 1361037, part 5: support locale subsets,

https://reviewboard.mozilla.org/r/140176/#review146888

::: compare_locales/tests/test_paths.py:359
(Diff revision 2)
> +        self.assertListEqual(
> +            list(files),
> +            [
> +                ('/tmp/de/major/good.ftl', None, set()),
> +                ('/tmp/de/minor/good.ftl', None, set()),
> +                ])

flake8 is also fine with a level of indention less, fixed consistently throughout the patch queue.
Comment on attachment 8868580 [details]
bug 1361037, Cleanup stats handling,

https://reviewboard.mozilla.org/r/140182/#review146920

::: compare_locales/compare.py:436
(Diff revision 3)
> +    def updateStats(self, file, stats):
> +        """Check observer for the found data, and if it's
> +        not to ignore, notify other_observers.
> +        """
> +        for observer in self.observers + self.other_observers:
> +            observer.updateStats(file, stats)

Yeah, I'll file a follow-up bug on that.
https://hg.mozilla.org/l10n/compare-locales/pushloghtml?changeset=24750ee963fd, and landed some forgotten review nits in 
https://hg.mozilla.org/l10n/compare-locales/rev/34d99a0dd563d4939b8c9e13fc60591e24243014.

Marking this bug fixed, and I'm going to have an extra bug to actually read the toml files. This bug has been long enough.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Summary: compare-locales should support generic configuration → prepare compare-locales to support generic configuration
Blocks: 1368025
Blocks: 1368026
(In reply to Axel Hecht [:Pike] from comment #80)
> Comment on attachment 8868580 [details]
> bug 1361037, Cleanup stats handling,
> 
> https://reviewboard.mozilla.org/r/140182/#review146920
> 
> ::: compare_locales/compare.py:436
> (Diff revision 3)
> > +    def updateStats(self, file, stats):
> > +        """Check observer for the found data, and if it's
> > +        not to ignore, notify other_observers.
> > +        """
> > +        for observer in self.observers + self.other_observers:
> > +            observer.updateStats(file, stats)
> 
> Yeah, I'll file a follow-up bug on that.

Filed bug 1368026.
Blocks: 1372254
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: