Closed
Bug 1317336
Opened 8 years ago
Closed 6 years ago
Make migrate.transforms.REPLACE work for more variants of gecko placable formats
Categories
(Localization Infrastructure and Tools :: Fluent Migration, defect)
Localization Infrastructure and Tools
Fluent Migration
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: Pike, Assigned: Pike)
References
(Blocks 1 open bug)
Details
(Whiteboard: [gecko-l20n])
Attachments
(2 files)
REPLACE currently is very limited to literal matches of placables in the source code to create results.
We'll need to enhance that to work with all the various flavors of placables in particular in properties.
Taking this out of the initial work on the python module in bug 1316640 to get to something landable.
Updated•7 years ago
|
Blocks: fluent.migrate
Comment 1•7 years ago
|
||
What formats should we support? I'm aware of StringBundle.getFormattedString's %S (and %1$S etc.). I also saw #1, #2 in files I worked with. What are those? Do they get replaced manually in the calling code?
Right now REPLACE takes a dict of literal specifiers to target FTL expression. I wonder if we should also allow an iterable of FTL expressions as an argument. The order would matter and we'd run a logic similar to that of StringBundle.getFormattedString to replace %S etc.
Comment 2•7 years ago
|
||
I can think of at least:
* %1$S, %S, $0.S
* %d (devtools)
* {{variable}} (PDF Viewer but also devtools)
* $BrandShortName (in installer properties)
Comment 3•7 years ago
|
||
Thanks. Am I right thinking that these:
> * {{variable}} (PDF Viewer but also devtools)
> * $BrandShortName (in installer properties)
Are fully handled by a map of replacements as implemented right now?
IIUC, the challenge is with the %S specifiers which may read %S in en-US but %2$S in the localization. This breaks in the map approach.
Comment 4•7 years ago
|
||
(In reply to Staś Małolepszy :stas from comment #3)
> Thanks. Am I right thinking that these:
>
> > * {{variable}} (PDF Viewer but also devtools)
> > * $BrandShortName (in installer properties)
>
> Are fully handled by a map of replacements as implemented right now?
I can't answer about that, I'm not really familiar with the migration code.
To answer the other part of your question, #1, #2 etc. are normally used in plural forms
https://transvision.mozfr.org/?recherche=%231&repo=gecko_strings&sourcelocale=en-US&locale=en-US&search_type=strings_entities
But there seem to be some weird cases, where those placeholders are replaced
https://hg.mozilla.org/l10n/gecko-strings/file/default/browser/chrome/browser/browser.properties#l404
https://hg.mozilla.org/l10n/gecko-strings/file/default/mobile/android/chrome/browser.properties#l69
http://searchfox.org/mozilla-central/rev/1ebd2eff44617df3b82eea7d2f3ca1b60cc591a0/mobile/android/chrome/content/browser.js#5139
Comment 5•7 years ago
|
||
(In reply to Francesco Lodolo [:flod] from comment #4)
> I can't answer about that, I'm not really familiar with the migration code.
The current code will look for a literal string "{{variable}}" or "$BrandShortName" and will replace it with some FTL expression as defined in teh migration.
> But there seem to be some weird cases, where those placeholders are replaced
> https://hg.mozilla.org/l10n/gecko-strings/file/default/browser/chrome/
> browser/browser.properties#l404
> https://hg.mozilla.org/l10n/gecko-strings/file/default/mobile/android/chrome/
> browser.properties#l69
> http://searchfox.org/mozilla-central/rev/
> 1ebd2eff44617df3b82eea7d2f3ca1b60cc591a0/mobile/android/chrome/content/
> browser.js#5139
Thanks for these examples. As long as localizers can't do anything funky to the specifier, i.e. change #1 to #1s or something, the mapping will work fine. The developer will say: replace #1 with FTL.Placeable(...) and in all locales, #1 will be found and replaced.
My understanding is that the above is not true for %S which can be spelled differently, e.g. as %1$S, in the localized files.
Comment 6•7 years ago
|
||
(In reply to Staś Małolepszy :stas from comment #5)
> My understanding is that the above is not true for %S which can be spelled
> differently, e.g. as %1$S, in the localized files.
Yes. Localizers are free to swap ordered/unordered placeholders, or use the zero-width trick to silence compare-locales warnings.
https://transvision.mozfr.org/?recherche=0.S&repo=gecko_strings&sourcelocale=en-US&locale=it&search_type=strings_entities
https://transvision.mozfr.org/?recherche=%251%240.S&repo=gecko_strings&sourcelocale=en-US&locale=it&search_type=strings_entities
Comment 7•7 years ago
|
||
Thanks again, this is exactly the kind of input I needed. This looks pretty complicated, too. Can there be both a %S and %1$S in one string?
Comment 8•7 years ago
|
||
(In reply to Staś Małolepszy :stas from comment #7)
> Thanks again, this is exactly the kind of input I needed. This looks pretty
> complicated, too. Can there be both a %S and %1$S in one string?
No, that's the only limit: you can't mix ordered and unordered placeholders in the same string.
Comment 9•7 years ago
|
||
I wrote a few tests to assert my understanding of the problem. Would you be able to take a look at them and check if they look right?
The setUp method mocks the legacy localized content from which the migration will source translations. Then, each test defines a migration spec (msg = ...) which in real-life would be written by the developer or l10n-drivers.
Assignee: nobody → stas
Status: NEW → ASSIGNED
Attachment #8923789 -
Flags: feedback?(l10n)
Attachment #8923789 -
Flags: feedback?(francesco.lodolo)
Comment 10•7 years ago
|
||
Comment on attachment 8923789 [details] [diff] [review]
Tests
Review of attachment 8923789 [details] [diff] [review]:
-----------------------------------------------------------------
It seems to make sense to me.
::: tests/migrate/test_replace.py
@@ +152,5 @@
> + def setUp(self):
> + self.strings = parse(PropertiesParser, '''
> + implicit = %S %S
> + explicit = %2$S %1$S
> + zero-width-implicit = %S%0.S
Add space: %S %0.S
@@ +153,5 @@
> + self.strings = parse(PropertiesParser, '''
> + implicit = %S %S
> + explicit = %2$S %1$S
> + zero-width-implicit = %S%0.S
> + zero-width-explicit = %2$S%1$0.S
Add space here too (consistency with the non zero width example): %2$S %1$0.S
@@ +212,5 @@
> +
> + self.assertEqual(
> + evaluate(self, msg).to_json(),
> + ftl_message_to_json('''
> + implicit = { $one } %S
Possibly unrelated to the tests, but we should raise a warning if we find ourselves in this kind of situation during migration.
Attachment #8923789 -
Flags: feedback?(francesco.lodolo) → feedback+
Comment 11•7 years ago
|
||
I modeled the %0.S examples after https://transvision.mozfr.org/?recherche=0.S&repo=gecko_strings&sourcelocale=en-US&locale=it&search_type=strings_entities. If we add the space the end result will be "{ $one } " with the trailing whitespace.
Also, the way this will be migrated will trigger compare-locales warnings for missing placeables. I don't know how to fix it but I can file a bug for this to start the discussion.
Comment 12•7 years ago
|
||
(In reply to Staś Małolepszy :stas from comment #11)
> I modeled the %0.S examples after
> https://transvision.mozfr.org/?recherche=0.
> S&repo=gecko_strings&sourcelocale=en-
> US&locale=it&search_type=strings_entities. If we add the space the end
> result will be "{ $one } " with the trailing whitespace.
True, but normally there would be other stuff in the middle. I think it's worth having a trailing whitespace, or maybe adding a word in the middle to making the test clearer?.
> Also, the way this will be migrated will trigger compare-locales warnings
> for missing placeables. I don't know how to fix it but I can file a bug for
> this to start the discussion.
Interesting. Unrelated question: wow do I avoid using a placeable and at the same time, not having compare-locales yelling at me?
Comment 13•7 years ago
|
||
(In reply to Francesco Lodolo [:flod] from comment #12)
> Interesting. Unrelated question: wow do I avoid using a placeable and at the
> same time, not having compare-locales yelling at me?
I feel like this is a very related question :) There's no way to do it in Fluent right now. We might want to use semantic comments for this (https://github.com/projectfluent/fluent/issues/16) in the future.
Assignee | ||
Comment 14•7 years ago
|
||
Comment on attachment 8923789 [details] [diff] [review]
Tests
Review of attachment 8923789 [details] [diff] [review]:
-----------------------------------------------------------------
To add to what flod said, we shouldn't create strings with printf params (if we have a printf substitution, that is).
Actually, I think we should either:
- not migrate this string at all
- create a proper fluent string with an unknown placable, creating a corresponding fluent c-l error.
implicit = { $one } { $fixme }
could be one.
Attachment #8923789 -
Flags: feedback?(l10n) → feedback+
Updated•7 years ago
|
No longer blocks: fluent.migrate
Component: Python Library → Fluent Migration
Product: L20n → Localization Infrastructure and Tools
Updated•6 years ago
|
Summary: Make l20n.migrate.transforms.REPLACE work for more variants of gecko placable formats → Make migrate.transforms.REPLACE work for more variants of gecko placable formats
Comment 15•6 years ago
|
||
From previous discussion with Stas, a good approach would be to normalize both source and target strings to use ordered arguments.
During normalization we should also remove zero-width arguments, e.g. (%0.S, %1$0.S)
Impossibile connettersi a Sync. Riprovare?%0.S
Non è stato possibile installare il motore di ricerca da: %2$S%1$0.S
Assignee | ||
Comment 16•6 years ago
|
||
Taking this. Also, not as simple as I had hoped, as bug 1496127 broke replacing numbered printfs (re.find("$") does weird things.)
Assignee: stas → l10n
Assignee | ||
Comment 17•6 years ago
|
||
%S %.0S %S -> %1$S %3$S
We normalize un-numbered printf arguments to numbered ones,
with just keeping the spec from the original format.
If the width was '0', then the argument was hidden in printf,
replicate that by removing the parameter completely.
Side-riding fix for a regression from bug 1496127, we need to
re.escape to find literal text fragments, otherwise things
like '$' end up being regex tokens.
Assignee | ||
Comment 18•6 years ago
|
||
Landed as https://hg.mozilla.org/l10n/fluent-migration/rev/635c8a4d83251ea18f4dd1e9c8fc5850df22e121.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•