Closed Bug 1288141 Opened 8 years ago Closed 8 years ago

[decision][l10n-conversion] document the intended outcome from the l10n migration process

Categories

(L20n :: General, defect)

Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: Pike)

References

(Blocks 1 open bug)

Details

User Story

The l10n migration process should create mercurial commits that are attributed to the person that touched any of the fragments that go into the string last.

The generated file should have comments from en-US, same whitespace and ordering.
Splitting this decision off of bug 1280685 so that we can decide on it separately. I'll start off with a user story as food for thought. My proposal has two opinions:
- We should be opinionated on having en-US comments in the translated file.
- I'd like to migrate l10n comments, but I'm not sure if that's a likely scenario. We'd need examples to validate that.
Asking for opinions on the user story and my last comment.
Flags: needinfo?(stas)
Flags: needinfo?(francesco.lodolo)
> The generated file should have comments from en-US, same whitespace and ordering.
I'm fine with this approach, assuming that we clean up the existing mess of comments and whitespace when creating the en-US FTL content, and that the new comments are meaningful and consistent. We should also expect some locales to remove these comments afterwards.
I don't believe there's value in keeping locale-specific comments. I consider that an extremely rare case, and localizers who added such comments should evaluate case by case whether they're still relevant.
Having useful blame information seems like a great pro, though I'm not sure how hard it is to implement.
Flags: needinfo?(francesco.lodolo)
My general preference is to unify the textual representation of FTL files as much as possible. Ultimately I'd like us to have an ftl-fmt tool that just formats translations according to a set of rules and good practices. The idea to include en-US comments is very much in line with this. +1 to that.
I have questions about preserving the blame information. How much work do you estimate this would add to the task? The blame will still be available in hg on the old files, so perhaps it's not worth it if it's a lot of extra effort to port. Also, if we intend to break files into smaller logical ones, it might get even more complex. Would you try to group changes to multiple files in a single commit attributed to a single person?
Flags: needinfo?(stas)
IMHO there are two use-cases for blame:
- Find out who matters for the content at hand. This brings me back to bug 1280685 comment 3, where I also answered about blame.
- Find out where to start looking in the history of things.
I expect history to be lost. We're not going to be able to map old files to new files, let alone strings in one to the other, in version control.
My quest for blame is somewhat built on all the times I see hg@1 in blame from when we switched from cvs to hg. It's just a very abrupt cut, and from practical experience, nobody's gonna write anything to bridge the gap. What we have today is gecko-dev, which just didn't create the gap, and thus is helpful. That's why I would like to carry as much VCS information as possible from old to new.
As for the number of commits, yes, I'd like to do aggressive squashing and commit reordering to reduce the number of commits we push.
We've discussed this on the last tech call, and agreed that the following is what we want to go with (unchanged from user story): The l10n migration process should create mercurial commits that are attributed to the person that touched any of the fragments that go into the string last. The generated file should have comments from en-US, same whitespace and ordering.
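As a rough illustration of the attribution rule above, the sketch below picks the commit author for one migrated string from per-fragment blame data. The `Fragment` type, its fields, and the sample authors and dates are all hypothetical; this bug does not describe the actual migration tooling.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fragment:
    """One legacy string fragment plus (hypothetical) blame data for it."""
    key: str
    author: str
    last_touched: date


def attribution_for(fragments):
    """Return the author who most recently touched any of the fragments."""
    if not fragments:
        raise ValueError("a migrated string needs at least one fragment")
    latest = max(fragments, key=lambda f: f.last_touched)
    return latest.author


# Made-up example: two legacy fragments that end up in a single new string.
fragments = [
    Fragment("dialog.title", "alice", date(2014, 3, 1)),
    Fragment("dialog.accesskey", "bob", date(2016, 5, 20)),
]
print(attribution_for(fragments))
```

Here the migrated string would be attributed to the second author, because that fragment was touched last. Squashing and reordering commits, as discussed earlier in this bug, would be a separate step on top of this.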
Assignee: nobody → l10n
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
(In reply to Axel Hecht [pto-Aug-30][:Pike] from comment #0)
> I'd like to migrate l10n comments, but I'm not sure if that's a likely
> scenario. We'd need examples to validate that.
I did a little research on local comments in Firefox as part of bug 1293040. I tried to figure out if we could automatically detect a local comment. I only ran the full analysis for German, because it was the only locale that expressed interest in preserving local comments and because the analysis was semi-automatic.
Summary: of 2608 German strings with comments, 122 strings have comments that are different than in English, but only 1 of them is an actual local comment. The remaining 121 comments are outdated versions of source comments.
Can we run this analysis on all hg-working locales/projects? Happy to help if you share the code. I'd hate to make a decision based on a single spot test.
(In reply to Axel Hecht [:Pike] from comment #7)
> Can we run this analysis on all hg-working locales/projects? Happy to help
> if you share the code. I'd hate to make a decision based on a single spot
> test.
I ran the analysis using my local instance of Pontoon, which already had the current Firefox Aurora set up. I just had to add another 'Firefox Aurora German' project, because Pontoon only imports comments for source files:
--
1. Set up a new repository with de Aurora strings in the folder called en-US (https://github.com/mathjazz/fake-aurora/tree/master/locales).
2. Set up a new project in Pontoon linked to that repository, with some locale enabled (it doesn't matter which).
3. Run Pontoon sync for this project.
4. In the Django shell, for all entities that have different comments for de and en-US, print the following to form https://bug1293040.bmoattachments.org/attachment.cgi?id=8781211:
   - entity IDs (concatenated with file paths)
   - de comments
   - en-US comments (from the actual Firefox Aurora project)
--
And finally, use common sense, Google and Mercurial to figure out why the de and en-US comments differ.
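The comparison in step 4 can be sketched without Pontoon, assuming the comments have already been extracted into plain dictionaries keyed by entity ID. This is not the code that was run in the Django shell; the function name and the sample data are hypothetical.

```python
def differing_comments(locale_comments, source_comments):
    """Yield (entity_id, locale_comment, source_comment) where the two differ.

    Both arguments map an entity ID (file path plus string key) to its
    comment text. Entities missing from either side are skipped.
    """
    shared_ids = locale_comments.keys() & source_comments.keys()
    for entity_id in sorted(shared_ids):
        local = locale_comments[entity_id].strip()
        source = source_comments[entity_id].strip()
        if local != source:
            yield entity_id, local, source


# Made-up sample data standing in for the de and en-US comments.
de = {
    "browser/a.properties:title": "Window title",
    "browser/a.properties:ok": "Old note about the OK button",
}
en_us = {
    "browser/a.properties:title": "Window title",
    "browser/a.properties:ok": "Label for the OK button",
}

for entity_id, local, source in differing_comments(de, en_us):
    print(entity_id)
    print("  de:    " + local)
    print("  en-US: " + source)
```

A listing like this only narrows down the candidates. Deciding whether a differing comment is a real local comment or an outdated copy of the source comment remains the manual step described above.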