Closed Bug 1376747 Opened 2 years ago Closed 2 years ago

[compare-locales][ftl] add check for internal message references that don't match en-US

Categories

(Localization Infrastructure and Tools :: compare-locales, enhancement)

enhancement
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: stas)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

This would mimick what we do for DTDs and external entity references:

Collect all internal message references in en-US, and warn if l10n references one that doesn't exist in en-US.
Blocks: 1280893
Blocks: 1424682
Note that even in Syntax 0.5 which adds private messages, we'll want to check references to both public and private messages. Private messages will be checked for presence and if referenced to from other messages. Attributes of private messages will not be checked at all by compare-locales.
Comment on attachment 8943600 [details]
Bug 1376747 - Check for message references that don't match en-US.

https://reviewboard.mozilla.org/r/213982/#review219678

Thanks for taking a stab at this, I have a couple of questions on the details that I'd also like flod's opinion on.

Side question, should we test for cyclic references?

    a = { b }
    b = { a }

::: compare_locales/checks.py:476
(Diff revision 1)
>  
> +        # verify that message references are the same
> +        ref_msg_refs = {}
> +        l10n_msg_refs = {}
> +
> +        def collect_message_references_into(refs):

Can we just move this to a method, and make it return the dict?

::: compare_locales/checks.py:484
(Diff revision 1)
> +            The returned function can be passed into BaseNode.traverse.
> +            '''
> +            def collect_message_references(node):
> +                if isinstance(node, ftl.MessageReference):
> +                    # The key is the name of the referenced message and it will
> +                    # be used in set arithmetic to find missing and obsolete

math guy here, we're not doing arithmetics.

... how about

`... it will be used as a set to ...`

::: compare_locales/checks.py:502
(Diff revision 1)
> +        l10n_msg_refs_names = set(l10n_msg_refs.keys())
> +
> +        missing_msg_ref_names = ref_msg_refs_names - l10n_msg_refs_names
> +        for msg_name in missing_msg_ref_names:
> +            yield ('error', 0, 'Missing message reference: ' + msg_name,
> +                   'fluent')

flod, what's your take on error or warning here?

I think that for branding, we have locales that reword UI dialogs to not have "Firefox" in them, so this should be a warning, at most?

::: compare_locales/checks.py:508
(Diff revision 1)
> +
> +        obsolete_msg_ref_names = l10n_msg_refs_names - ref_msg_refs_names
> +        for msg_name in obsolete_msg_ref_names:
> +            offset = l10n_msg_refs[msg_name].span.start - l10n_entry.span.start
> +            yield ('error', offset, 'Obsolete message reference: ' + msg_name,
> +                   'fluent')

Similar question here. I think a warning is a good idea, but an error would prevent the Japanese from using a glossary fluent file.

We might need to detect message references to public messages, and if they're in the ftl file, they're OK?

Private messages need to match?

Internal messages are OK?
Attachment #8943600 - Flags: review?(l10n) → review-
Formally flagging flod on my questions.

PS: we can iron out the internal message references for ja-JP-mac later, but not that much later.
Flags: needinfo?(francesco.lodolo)
Comment on attachment 8943600 [details]
Bug 1376747 - Check for message references that don't match en-US.

https://reviewboard.mozilla.org/r/213982/#review219678

I don't think we reliably can without knowing which files will be used to create a context.

> math guy here, we're not doing arithmetics.
> 
> ... how about
> 
> `... it will be used as a set to ...`

We're subtracting sets. I guess algebra is a better word?

> Similar question here. I think a warning is a good idea, but an error would prevent the Japanese from using a glossary fluent file.
> 
> We might need to detect message references to public messages, and if they're in the ftl file, they're OK?
> 
> Private messages need to match?
> 
> Internal messages are OK?

I'd prefer the public vs. private discussion to be part of the upgrade to python-fluent 0.6 and not part of this bug.
(In reply to Axel Hecht [:Pike] from comment #3)
> ::: compare_locales/checks.py:502
> (Diff revision 1)
> > +        l10n_msg_refs_names = set(l10n_msg_refs.keys())
> > +
> > +        missing_msg_ref_names = ref_msg_refs_names - l10n_msg_refs_names
> > +        for msg_name in missing_msg_ref_names:
> > +            yield ('error', 0, 'Missing message reference: ' + msg_name,
> > +                   'fluent')
> 
> flod, what's your take on error or warning here?
> 
> I think that for branding, we have locales that reword UI dialogs to not
> have "Firefox" in them, so this should be a warning, at most?

We need to go for warnings here, and review case per case. Plenty of languages want to reduce the plethora of brand names in English strings.

As far as I can tell, there's also no way to trick the system like we used to (e.g. %S -> %0.S), for cases where we want to drop the variable.

> ::: compare_locales/checks.py:508
> (Diff revision 1)
> > +
> > +        obsolete_msg_ref_names = l10n_msg_refs_names - ref_msg_refs_names
> > +        for msg_name in obsolete_msg_ref_names:
> > +            offset = l10n_msg_refs[msg_name].span.start - l10n_entry.span.start
> > +            yield ('error', offset, 'Obsolete message reference: ' + msg_name,
> > +                   'fluent')
> 
> Similar question here. I think a warning is a good idea, but an error would
> prevent the Japanese from using a glossary fluent file.

I assume you mean replicating what they're currently doing to convert ja from ja-JP-mac before it reaches the l10n repository?

Do we need to check for obsolete references at all, if we have no way to tell if they're available in the context? What happens practically to the UI for this type of strings (unresolved reference)?

In case, it should be a warning.
Assignee: nobody → stas
Flags: needinfo?(francesco.lodolo)
(In reply to Francesco Lodolo [:flod] from comment #6)

> As far as I can tell, there's also no way to trick the system like we used
> to (e.g. %S -> %0.S), for cases where we want to drop the variable.

That's correct. I don't really know how to allow something like this in FTL in the future. Semantic comments perhaps?

> Do we need to check for obsolete references at all, if we have no way to
> tell if they're available in the context? What happens practically to the UI
> for this type of strings (unresolved reference)?

Right now, a message defined as:

    foo = Unresolved { message-reference }.

would be formatted as:

    Unresolved message-reference.
(In reply to Staś Małolepszy :stas from comment #7)
> (In reply to Francesco Lodolo [:flod] from comment #6)
> 
> > As far as I can tell, there's also no way to trick the system like we used
> > to (e.g. %S -> %0.S), for cases where we want to drop the variable.
> 
> That's correct. I don't really know how to allow something like this in FTL
> in the future. Semantic comments perhaps?

That sounds definitely interesting (sort of es-lint exclusion).

> > Do we need to check for obsolete references at all, if we have no way to
> > tell if they're available in the context? What happens practically to the UI
> > for this type of strings (unresolved reference)?
> 
> Right now, a message defined as:
> 
>     foo = Unresolved { message-reference }.
> 
> would be formatted as:
> 
>     Unresolved message-reference.

OK, then warnings sounds like a good idea.
Comment on attachment 8943600 [details]
Bug 1376747 - Check for message references that don't match en-US.

https://reviewboard.mozilla.org/r/213982/#review219814
Attachment #8943600 - Flags: review?(l10n) → review+
Attachment #8943741 - Attachment is obsolete: true
Attachment #8943741 - Flags: review?(l10n)
https://hg.mozilla.org/l10n/compare-locales/rev/229648855a49091fb5fbb9b1162f2313e18388b0, marking FIXED.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.