Closed Bug 628746 Opened 15 years ago Closed 1 year ago

Add a precommit hook to SUMO to catch formatting syntax error in translations

Categories

(Webtools Graveyard :: Verbatim, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: stas, Unassigned)

References

Details

Attachments

(1 file, 1 obsolete file)

Attached file WIP hook (obsolete) —
It's a bit too easy currently for a localizer to accidentally break SUMO. Just to give an example of what I'm talking about: >>> "{0".format("a") Traceback (most recent call last): ... ValueError: unmatched '{' in format >>> "{0} {1} {2}".format("a", "b") Traceback (most recent call last): ... IndexError: tuple index out of range Oops. It should be possible to prevent Verbatim from committing a broken file using the precommit hook. Attached is a WIP hook (which should be called sumo.py, but I tested it on https://localize.mozilla.org/projects/testing/). I didn't want to install anything on Verbatim's VM, so I just copied jbalogh's code from [1] verbatim (pun intended, haha). That's a temporary solution while this is WIP. [1] https://github.com/jbalogh/tower/blob/checkpo/tower/management/commands/checkpo.py If there's a formatting syntax error in the file, the commit does not succeed. I'm still not sure why Pootle says: Failed to commit file: 'NoneType' object is not iterable Reading the code [2], it shouldn't catch the exception if `filestocommit` is an empty list (which is what the hook returns if it encounters a formatting syntax error). [2] http://translate.svn.sourceforge.net/viewvc/translate/src/trunk/Pootle/local_apps/pootle_translationproject/models.py?revision=17216&view=markup#l388 * * * Unfortunately, I just tested the hook again and it also prevents valid files from being committed (hey, I warned you that it was WIP! :) ). I'll investigate tomorrow.
Thanks for spearheading this, Stas! I, too, am interested in enabling such a hook on Flux projects like Input. Our goal should be to have a reusable svn commit hook that we can use on any of our projects. We can then import this into playdoh (or point to it) and suggest its use for all future projects.
I'm not sure why you need to add an extra dependency for this task (Babel) — the Translate Toolkit is available for the Pootle environment so that should be enough to read a PO file.
Hey Fred, thanks for sharing the plans for playdoh. I felt inspired, so I implemented the same thing as a pre-commit hook for svn :) It actually works this time (most of the times, at least; see below), and the added advantage is that you'll get errors if you work directly with SVN too, not only through Verbatim. I tested this in a throwaway repo that I set up locally, here are the steps to test yourself: > 1. svnadmin create ~/repo > 2. Put the hook in ~/repo/hooks > 3. chmod +x ~/repo/hooks/pre-commit > 4. svn co file:///home/user/repo ~/working_copy > 5. cd ~/working_copy > 6. curl http://viewvc.svn.mozilla.org/vc/projects/granary/testing/locale/pl/LC_MESSAGES/messages.po?revision=81739&view=co > messages.po > 7. svn add messages.po > 8. svn ci -m "This should fail" The hook blocks commits with files with invalid strings, where invalid means "the msgstr does not have the same placables as the msgid". See http://viewvc.svn.mozilla.org/vc/projects/granary/testing/locale/pl/LC_MESSAGES/messages.po?revision=81739&view=co for examples of such invalid strings. It kind of chokes, however, on actual syntax errors, like a missing matching brace. If you put something like the following in a msgid or msgstr, the hook will throw a ValueError (also causing a rollback of the commit...): # note the missing } in msgstr msgid "Delete {0}" msgstr "Usuń {0" Ideally, the hook should only check strings which are meant to be formatted, and while there is a gettext flag that means exactly this ('python-format'), Babel removes it from strings using the new formatting syntax ({}, not %) :/ See https://github.com/jbalogh/tower/commit/cf2cee44f196f127ec4d3910d0f4dec916bdc2a4#L0R62 for details. I haven't tested it with Verbatim yet. I'm not sure what the behavior will be when the SVN rejects the commit. Probably something similar to the errors we've already seen regarding lack of commit privileges and similar. I'm hoping that the list of actual invalid strings which the hook outputs to stderr will be displayed in Verbatim as well.
Attachment #506840 - Attachment is obsolete: true
Attachment #507426 - Flags: feedback?(fwenzel)
(In reply to comment #2) > I'm not sure why you need to add an extra dependency for this task (Babel) — > the Translate Toolkit is available for the Pootle environment so that should be > enough to read a PO file. I wanted to reuse Jeff's code from https://github.com/jbalogh/tower/blob/checkpo/tower/management/commands/checkpo.py without modifying it too much. I'll investigate TT for sure. As you can see from my previous comment, Babel has a problem with the 'python-format' flag. So, we might need to swap it for TT.
Here's some WIP patches to make this better: http://pastebin.mozilla.org/991932 1. Fix babel's python-format to detect .format() strings 2. Don't raise a value error if the string doesn't have %s (this happens when the singular version doesn't have %s but the plural does) http://pastebin.mozilla.org/991936 1. Use babel's python_format checker for %s strings. babel is smarter and allows the msgstr to drop a %s. For example: msgid: "%(num)s addon" msgstr: "one addon" >>> 'adadf' % {'a': 1} 'adadf' It's ok to drop named arguments from a string. I'd like to do a similar thing for .format() strings.
In the Kurdish translation of the AMO website, a translator committed as a translation for "{0} star" the string "Stêrkek", because it is correct to translate this without a number. We were told this would break things because of the missing "{0}".(Plural is "{0} stêrk") In Kurdish, in singulars this would always be the case, there is no number. We would like to do the correct translation without breaking things. Stas suggested to have "1 star" as the original string, since for a singular it's always "1", no variable needed.
%(name)s and {0} can be safely removed from strings. Bare %s cannot be removed. >>> '{0} star'.format(1) '1 star' >>> '1 star'.format(1) '1 star' >>> '%(num)s star' % {'num': 1} '1 star' >>> '1 star' % {'num': 1} '1 star' >>> '%s star' % 1 '1 star' >>> 'one star' % 1 ------------------------------------------------------------ Traceback (most recent call last): File "<ipython console>", line 1, in <module> TypeError: not all arguments converted during string formatting
I might have missed some earlier discussion, but why not just use msgfmt -cv? Or is the issue that the {} notation is not supported by it?
(In reply to comment #7) > %(name)s and {0} can be safely removed from strings. Bare %s cannot be removed. True, but I wonder how we could modify the check in the hook to detect this kind of scenarios. I think it would be easier to remove the {0} from singular en-US--then the check in the hook works correctly.
(In reply to comment #8) > I might have missed some earlier discussion, but why not just use msgfmt -cv? > Or is the issue that the {} notation is not supported by it? That's correct, msgfmt --check-format only supports the modulo notation.
I'd suggest also adding this as a variable style in Pootle. At least this way translators can review variable errors themselves to eliminate these. The commit hook would only catch actual errors. see translate/filter/checks.py:1229 and 1331
Just a thought: Can this same hook be used to automatically compile .mo files? Ideally, it'd check the validity of a .po file, and if it was accepted, it'll compile the corresponding .mo file and add it to the commit (debatable!). Whacha all think?
Staś, is this precommit hook now deployed in the repository? If it is, it would mean that all commits are tested for validity, not just from verbatim, which sounds great. In that case, is there anything outstanding here? Maybe just to add the Python variable check to Pootle's quality checks?
Comment on attachment 507426 [details] pre-commit hook for svn Removing the feedback flag for now, let me know if this is still current and I'll be happy to take another look.
Attachment #507426 - Flags: feedback?(fwenzel)
Product: Webtools → Webtools Graveyard
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: