Closed
Bug 628746
Opened 15 years ago
Closed 1 year ago
Add a precommit hook to SUMO to catch formatting syntax error in translations
Categories
(Webtools Graveyard :: Verbatim, defect)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: stas, Unassigned)
References
Details
Attachments
(1 file, 1 obsolete file)
7.83 KB,
text/plain
|
Details |
It's a bit too easy currently for a localizer to accidentally break SUMO. Just to give an example of what I'm talking about:
>>> "{0".format("a")
Traceback (most recent call last):
...
ValueError: unmatched '{' in format
>>> "{0} {1} {2}".format("a", "b")
Traceback (most recent call last):
...
IndexError: tuple index out of range
Oops.
It should be possible to prevent Verbatim from committing a broken file using the precommit hook.
Attached is a WIP hook (which should be called sumo.py, but I tested it on https://localize.mozilla.org/projects/testing/).
I didn't want to install anything on Verbatim's VM, so I just copied jbalogh's code from [1] verbatim (pun intended, haha). That's a temporary solution while this is WIP.
[1] https://github.com/jbalogh/tower/blob/checkpo/tower/management/commands/checkpo.py
If there's a formatting syntax error in the file, the commit does not succeed. I'm still not sure why Pootle says:
Failed to commit file: 'NoneType' object is not iterable
Reading the code [2], it shouldn't catch the exception if `filestocommit` is an empty list (which is what the hook returns if it encounters a formatting syntax error).
[2] http://translate.svn.sourceforge.net/viewvc/translate/src/trunk/Pootle/local_apps/pootle_translationproject/models.py?revision=17216&view=markup#l388
* * *
Unfortunately, I just tested the hook again and it also prevents valid files from being committed (hey, I warned you that it was WIP! :) ). I'll investigate tomorrow.
Comment 1•15 years ago
|
||
Thanks for spearheading this, Stas! I, too, am interested in enabling such a hook on Flux projects like Input. Our goal should be to have a reusable svn commit hook that we can use on any of our projects. We can then import this into playdoh (or point to it) and suggest its use for all future projects.
Comment 2•15 years ago
|
||
I'm not sure why you need to add an extra dependency for this task (Babel) — the Translate Toolkit is available for the Pootle environment so that should be enough to read a PO file.
Reporter | ||
Comment 3•15 years ago
|
||
Hey Fred, thanks for sharing the plans for playdoh. I felt inspired, so I implemented the same thing as a pre-commit hook for svn :)
It actually works this time (most of the times, at least; see below), and the added advantage is that you'll get errors if you work directly with SVN too, not only through Verbatim.
I tested this in a throwaway repo that I set up locally, here are the steps to test yourself:
> 1. svnadmin create ~/repo
> 2. Put the hook in ~/repo/hooks
> 3. chmod +x ~/repo/hooks/pre-commit
> 4. svn co file:///home/user/repo ~/working_copy
> 5. cd ~/working_copy
> 6. curl http://viewvc.svn.mozilla.org/vc/projects/granary/testing/locale/pl/LC_MESSAGES/messages.po?revision=81739&view=co > messages.po
> 7. svn add messages.po
> 8. svn ci -m "This should fail"
The hook blocks commits with files with invalid strings, where invalid means "the msgstr does not have the same placables as the msgid". See http://viewvc.svn.mozilla.org/vc/projects/granary/testing/locale/pl/LC_MESSAGES/messages.po?revision=81739&view=co for examples of such invalid strings.
It kind of chokes, however, on actual syntax errors, like a missing matching brace. If you put something like the following in a msgid or msgstr, the hook will throw a ValueError (also causing a rollback of the commit...):
# note the missing } in msgstr
msgid "Delete {0}"
msgstr "Usuń {0"
Ideally, the hook should only check strings which are meant to be formatted, and while there is a gettext flag that means exactly this ('python-format'), Babel removes it from strings using the new formatting syntax ({}, not %) :/ See https://github.com/jbalogh/tower/commit/cf2cee44f196f127ec4d3910d0f4dec916bdc2a4#L0R62 for details.
I haven't tested it with Verbatim yet. I'm not sure what the behavior will be when the SVN rejects the commit. Probably something similar to the errors we've already seen regarding lack of commit privileges and similar. I'm hoping that the list of actual invalid strings which the hook outputs to stderr will be displayed in Verbatim as well.
Attachment #506840 -
Attachment is obsolete: true
Attachment #507426 -
Flags: feedback?(fwenzel)
Reporter | ||
Comment 4•15 years ago
|
||
(In reply to comment #2)
> I'm not sure why you need to add an extra dependency for this task (Babel) —
> the Translate Toolkit is available for the Pootle environment so that should be
> enough to read a PO file.
I wanted to reuse Jeff's code from https://github.com/jbalogh/tower/blob/checkpo/tower/management/commands/checkpo.py without modifying it too much.
I'll investigate TT for sure. As you can see from my previous comment, Babel has a problem with the 'python-format' flag. So, we might need to swap it for TT.
Comment 5•15 years ago
|
||
Here's some WIP patches to make this better:
http://pastebin.mozilla.org/991932
1. Fix babel's python-format to detect .format() strings
2. Don't raise a value error if the string doesn't have %s (this happens when the singular version doesn't have %s but the plural does)
http://pastebin.mozilla.org/991936
1. Use babel's python_format checker for %s strings. babel is smarter and allows the msgstr to drop a %s. For example:
msgid: "%(num)s addon"
msgstr: "one addon"
>>> 'adadf' % {'a': 1}
'adadf'
It's ok to drop named arguments from a string. I'd like to do a similar thing for .format() strings.
Comment 6•15 years ago
|
||
In the Kurdish translation of the AMO website, a translator committed as a translation for "{0} star" the string "Stêrkek", because it is correct to translate this without a number. We were told this would break things because of the missing "{0}".(Plural is "{0} stêrk")
In Kurdish, in singulars this would always be the case, there is no number.
We would like to do the correct translation without breaking things.
Stas suggested to have "1 star" as the original string, since for a singular it's always "1", no variable needed.
Comment 7•15 years ago
|
||
%(name)s and {0} can be safely removed from strings. Bare %s cannot be removed.
>>> '{0} star'.format(1)
'1 star'
>>> '1 star'.format(1)
'1 star'
>>> '%(num)s star' % {'num': 1}
'1 star'
>>> '1 star' % {'num': 1}
'1 star'
>>> '%s star' % 1
'1 star'
>>> 'one star' % 1
------------------------------------------------------------
Traceback (most recent call last):
File "<ipython console>", line 1, in <module>
TypeError: not all arguments converted during string formatting
Comment 8•15 years ago
|
||
I might have missed some earlier discussion, but why not just use msgfmt -cv? Or is the issue that the {} notation is not supported by it?
Reporter | ||
Comment 9•15 years ago
|
||
(In reply to comment #7)
> %(name)s and {0} can be safely removed from strings. Bare %s cannot be removed.
True, but I wonder how we could modify the check in the hook to detect this kind of scenarios. I think it would be easier to remove the {0} from singular en-US--then the check in the hook works correctly.
Reporter | ||
Comment 10•15 years ago
|
||
(In reply to comment #8)
> I might have missed some earlier discussion, but why not just use msgfmt -cv?
> Or is the issue that the {} notation is not supported by it?
That's correct, msgfmt --check-format only supports the modulo notation.
Comment 11•15 years ago
|
||
I'd suggest also adding this as a variable style in Pootle. At least this way translators can review variable errors themselves to eliminate these. The commit hook would only catch actual errors.
see translate/filter/checks.py:1229 and 1331
Comment 12•14 years ago
|
||
Just a thought: Can this same hook be used to automatically compile .mo files? Ideally, it'd check the validity of a .po file, and if it was accepted, it'll compile the corresponding .mo file and add it to the commit (debatable!). Whacha all think?
Comment 13•14 years ago
|
||
Staś, is this precommit hook now deployed in the repository? If it is, it would mean that all commits are tested for validity, not just from verbatim, which sounds great. In that case, is there anything outstanding here? Maybe just to add the Python variable check to Pootle's quality checks?
Comment 14•14 years ago
|
||
Comment on attachment 507426 [details]
pre-commit hook for svn
Removing the feedback flag for now, let me know if this is still current and I'll be happy to take another look.
Attachment #507426 -
Flags: feedback?(fwenzel)
Updated•13 years ago
|
See Also: → http://bugs.locamotion.org/show_bug.cgi?id=2282
Assignee | ||
Updated•10 years ago
|
Product: Webtools → Webtools Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•