Closed Bug 399014 Opened 18 years ago Closed 16 years ago

we need an l10n-merge tool

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: Pike, Assigned: zbraniecki)

References

Details

Attachments

(2 files, 8 obsolete files)

Pur first attempt; the latest source on the Google Doc from our website 18 years ago Armen [:armenzg] 6.42 KB, application/x-zip-compressed		Details
brief instructions on how to make run our little attempt to make an l10n-merge tool 18 years ago Armen [:armenzg] 2.16 KB, text/plain		Details
l10n-merge script package 18 years ago dynamis (Tomoya ASAI) 42.88 KB, application/zip		Details
patch to fix treatment of ignored entities 18 years ago dynamis (Tomoya ASAI) 396 bytes, patch		Details \| Diff \| Splinter Review
fix a typo 18 years ago Axel Hecht [:Pike] 3.17 KB, patch		Details \| Diff \| Splinter Review
merge bug 391680, Paths and filter.py 18 years ago Axel Hecht [:Pike] 10.91 KB, patch		Details \| Diff \| Splinter Review
factor out common parts of CompareCollector, subclass for the rest 18 years ago Axel Hecht [:Pike] 2.09 KB, patch		Details \| Diff \| Splinter Review
merging the more standards-compliant regexps from trunk 18 years ago Axel Hecht [:Pike] 1.81 KB, patch		Details \| Diff \| Splinter Review
add back a bookmarksparser 18 years ago Axel Hecht [:Pike] 4.55 KB, patch		Details \| Diff \| Splinter Review
complete patch for l10n-merge 18 years ago Axel Hecht [:Pike] 27.60 KB, patch		Details \| Diff \| Splinter Review

Axel Hecht [:Pike]

Reporter

Description

•

18 years ago

We need a tool to merge l10n-impact changes in en-US into an existing localization, both for localization work, and for the build process. If we can include such a tool into the build, we could generate builds for testing independently of the current status of a localization, and make that process much more agile. Dynamis was having an intern working on this, can you report on the current status?

John O'Duinn [:joduinn] (please use "needinfo?" flag)

Updated

•

18 years ago

Blocks: 398954

Armen [:armenzg]

Comment 1

•

18 years ago

Hello all, my name's Armen and I'm working with Rueen Fiez and Vincent Lam on this tool; You can see our progress on the following website: http://zenit.senecac.on.ca/wiki/index.php/Automated_localization_build_tool We have no knowledge of python, we don't know how are the l10n tools are used and we are strugglinh with a lot of assignments, please understand that we wrote it from scratch based on the examples that Axel has sent. Said that this is our initial testing that we got after six hours "trying to code something" rather than keep on reading without knowing what to learn http://docs.google.com/Doc?id=dnkkbhp_65p4whj5 During this week we will have to have something working for our open source course, so if you can give us some feedback, we will really appreciate. Regards, Armen on behalf of our team PS = Dynamis could you please post somewhere some of your code? or a blog? or somehting PS2 = why is it called autol10n-MERGE the tool? why MERGE?

Armen [:armenzg]

Comment 2

•

18 years ago

Attached file Pur first attempt; the latest source on the Google Doc from our website (obsolete) — Details

Armen [:armenzg]

Comment 3

•

18 years ago

Attached file brief instructions on how to make run our little attempt to make an l10n-merge tool (obsolete) — Details

I have added some instructions on how to use the first attachment to this bug. This Tuesday, Oct 23rd, we are going to work more on our tool adding the functionality to go over subfolders and making more changes to all the DTD and properties files. We would also like to add more flexibility with the regular expressions Please any feedback will be much appreciated, we will be on #seneca and #l10n Armen, Rueen and Vincent

Axel Hecht [:Pike]

Reporter

Comment 4

•

18 years ago

Hi Armen, to me, the l10n-merge tool and the code you're working on are two separate tools. They hopefully would be able to share code on the side of writing changes, but that's about it. Regarding the 0.2 version, I looked at the script on Google Docs, and I bet I found a bug. Take a look at http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/browser/locales/en-US/chrome/browser/setDesktopBackground.dtd&rev=1.1&mark=6, where 'color' shows up in the entity name. This is one of the reasons why I'd want your tool to work on top of the parsing arch and only postprocess the values, and not the complete files. Sadly my poking for the actual l10n-merge tool works didn't turn out too successful so far.

Armen [:armenzg]

Comment 5

•

18 years ago

Here is our 0.2.2 attempt from last week and the Google Docs code wasn't updated: http://matrix.senecac.on.ca/~azambran/mozilla/l10nmerge0.2.2.zip I have added this line: return re.sub(r'(.*)([Cc])olor', r'\1\2olour', instring) I don't know how is the right way to put code and discuss about it. Axel, do you want us to file another bug for our project?

Chris Cooper [:coop] (he/him)

Comment 6

•

18 years ago

Pike: John and I were just discussing this, or at least I think we were. When you speak of the l10n-merge tool, are you looking for something that will create a diff of string resources between the previous build and the current build, and mark any omissions with easily searchable placeholder text for localizers to find? Just looking for a little more direction here, if you haven't heard anything more from dynamis.

dynamis (Tomoya ASAI)

Comment 7

•

18 years ago

Hi all. Sorry coming late. I will attach my l10n-merge script based on compare-locale.py. What the script will do: - supporting dtd, properties and inc files - keep white space and comments - keep entity order - add new en-US entities at the end of file (and their comment) - comment out obsolate l10n entities (and their comment) # or remove them with -c option - copy new en-US file to l10n - remove obsolate file # or backup them with -b option (filename -> filename~) workflow for each file is: 1. split en-US/l10n file into the 'entity blocks' and make key table 2. compare key en-US/l10n key table 3. comment out or remove obsolate l10n 'entity blocks' 4. add new en-US entity at the end of l10n file entity block is: - First comment block of the file is considered as 'header' and don't touch it. # It is in many case license block and not comment for the first # entity. So, we don't consider it part of the first entity block. - Preceding white spaces and comments of each entity will be included - Following comment in the same line will also included # we sometimes add comment after the entity in the same line example: en-US: 1:  2:  3: <!ENTITY entity1 "value1"> 4: 5:  6: <!ENTITY entity2 "value2">  7:  8: <!ENTITY entity3 "value3"> header: line1 entity block1: line2~3 entity block2: line4~6 entity block3: line7~8 if l10n is: 1:  2:  3: <!ENTITY entity1 "value1"> 4:  5: <!ENTITY entity3 "value3"> 6: <!ENTITY entity4 "value4"> merged l10n will be: 1:  2:  3: <!ENTITY entity1 "value1"> 4: 5:  6:  6: <!ENTITY entity2 "value2">  7:  8:

dynamis (Tomoya ASAI)

Comment 8

•

18 years ago

Attached file l10n-merge script package — Details

Our l10n-merge python script based on Axel's compare-locale script This l10n-merge script may still have some bugs but as far as I tested working fine.

dynamis (Tomoya ASAI)

Comment 9

•

18 years ago

oops, line 8 was just my typo :( > 8: <!-- XXX l10n merge: new entities (Wed Oct 31 03:38:16 2007) obsolate entities, not new entities. This is just my typo here, not bug of the script. # I wrote above example by hand.

Axel Hecht [:Pike]

Reporter

Comment 10

•

18 years ago

Thanks to Toshihiro Kura and dynamis for getting us this far. I have been looking at the code already, and we'll need to do some merging and factorization. Paths.py and Parser.py need to be merged, and we should factor the common code between CompareLocales.py and L10nMerge.py as well as compare-locales and l10n-merge.py. There are a few minor nits like spelling fixes, and the options should be a param, those are simple, though. (In reply to comment #5) > > I don't know how is the right way to put code and discuss about it. I suggest that you take a look at L10nMerge.py. Yes, we're going to do some changes to it, but it should give you a really good idea on how to modify just the entity values. > Axel, do you want us to file another bug for our project? Yes, please.

dynamis (Tomoya ASAI)

Comment 11

•

18 years ago

(In reply to comment #10) > I have been looking at the code already, and we'll need to do some merging and > factorization. Paths.py and Parser.py need to be merged, and we should factor > the common code between CompareLocales.py and L10nMerge.py as well as > compare-locales and l10n-merge.py. We didn't have enough time and didn't care enough about factorization. I concentrated to make the script works strictly as I want. I don't enough time now and if you merge common parts of l10n-merge and compare-locales, it's very nice. To make the script works strictly, most important and complex part of the code is regular expression in Parser.py I think. Regular expression of compare-locale was not enough strict (in fact bug found) and I completely rewrote them. Call me when you finished merging l10n-merge and compare-locales. I'll check if merged one still works as I expected. And if you will not share my regular expression for compare-locales, I will look latest compare-locales too and make patch for the minor bug (if it still exists). Thanks.

dynamis (Tomoya ASAI)

Comment 12

•

18 years ago

Attached patch patch to fix treatment of ignored entities (obsolete) — Details — Splinter Review

My peer found the bug of L10nMerge.py and this is the patch to fix it. Without this patch, l10n-merge will comment out l10n only ignored entities. For example browser.search.order.[3-9] in region.properties. They are defined only in l10n files but should not be commented out as obsolate.

Axel Hecht [:Pike]

Reporter

Comment 13

•

18 years ago

Attached patch fix a typo (obsolete) — Details — Splinter Review

I have a bunch of patches already, I should shovel those out the wild. I'm currently having my own little bzr rep. I'd say that I basically reviewed the original zip by dynamis, if someone would volunteer to review the merge patches, that'd be cool. I'll be attaching a final patch against mozilla/testing/tests/l10n in the end. First patch is a spelling fix.

Assignee: nobody → l10n

Status: NEW → ASSIGNED

Attachment #288008 - Flags: review?

Axel Hecht [:Pike]

Reporter

Comment 14

•

18 years ago

Attached patch merge bug 391680, Paths and filter.py (obsolete) — Details — Splinter Review

Dynamis, this one should be actually interesting to review, it's incorporating bug 391680, which changes the way I ignore entities and files, and it may, sorry for that, obsolete the patch you just attached.

Attachment #288010 - Flags: review?

Axel Hecht [:Pike]

Reporter

Updated

•

18 years ago

Attachment #288010 - Flags: review? → review?(bugzilla)

Axel Hecht [:Pike]

Reporter

Comment 15

•

18 years ago

Attached patch factor out common parts of CompareCollector, subclass for the rest (obsolete) — Details — Splinter Review

More merge foo, CompareCollector should stay unchanged, and merge should just subclass that one.

Axel Hecht [:Pike]

Reporter

Updated

•

18 years ago

Blocks: 403215

Axel Hecht [:Pike]

Reporter

Comment 16

•

18 years ago

Attached patch merging the more standards-compliant regexps from trunk (obsolete) — Details — Splinter Review

Axel Hecht [:Pike]

Reporter

Comment 17

•

18 years ago

Attached patch add back a bookmarksparser (obsolete) — Details — Splinter Review

I added a new bookmarks parser, the old one seems to be a tad too picky (busting es-AR, at least). The new one is based on HTMLParser.HTMLParser, and loosely on what Clint did for Minotaur. I think this one is a tad more strict, though. The bookmarks parser can't do merges yet, I think that will be a good starting point for the l10n-fork team. With a small nit - os.mkdir(os.path.dirname(l_fl)) + os.makedirs(os.path.dirname(l_fl)) the code is now able to create new localizations from sparse ones, in the case of en-IN, sparse-en-IN/browser/profile/bookmarks.html sparse-en-IN/browser/searchplugins/a1books.xml sparse-en-IN/browser/searchplugins/amazon-en-GB.xml sparse-en-IN/browser/searchplugins/eBay-en-IN.xml sparse-en-IN/browser/searchplugins/list.txt sparse-en-IN/browser/searchplugins/yahoo-en-IN.xml sparse-en-IN/other-licenses/branding/firefox/brand.properties sparse-en-IN/toolkit/chrome/global/intl.properties Making that work good might still take a bit. Remaining tasks are - unbust compare-locales, CompareLocales - actually share common code - some further clean up, at least the global options var needs to die. PS: Using yield in the __iter__ function of Parser is likely a change that should be made for other parser impls, too. Not necessarily in this bug, though.

Axel Hecht [:Pike]

Reporter

Comment 18

•

18 years ago

Attached patch complete patch for l10n-merge — Details — Splinter Review

Complete patch for l10n-merge to mozilla/testing/tests/l10n. I factored more code, and I all but the parser patches on L10N_MERGE_BRANCH incrementally, to ease review. http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=L10N_MERGE_BRANCH&branchtype=match&dir=%2Fmozilla%2Ftesting%2Ftests%2Fl10n%2F&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-11-19&maxdate=2007-11-21&cvsroot=%2Fcvsroot has the check-ins between the trunk and the current tip on that branch. This patch is actually nothing but a cvs update -j L10N_MERGE_BASE -j L10N_MERGE_BRANCH I left a few things up to volunteers or follow-up bugs: - for some use-cases, I'd prefer to add entities without comments, in particular, for build purposes - for build automation as well, I think it's be better to have the generated files in a separate dir. I'm not so fond of cvs update -C, it doesn't always work for me, it might leave dangling files, and I expect it to be more time intensive, too - there's more room for factorization, in particular between CompareLocales.compare and L10nMerge.merge - BookmarksParser can't merge yet, l10n-fork will need that, though - Parser should use generators for __iter__, like BookmarksParser does now Dynamis offered a review, I'd like to have a review from our python tool crew, too, any volunteers?

Attachment #288005 - Attachment is obsolete: true

Attachment #288008 - Attachment is obsolete: true

Attachment #288010 - Attachment is obsolete: true

Attachment #288828 - Attachment is obsolete: true

Attachment #289389 - Attachment is obsolete: true

Attachment #289393 - Attachment is obsolete: true

Attachment #289508 - Flags: review?(bugzilla)

Attachment #288008 - Flags: review?

Attachment #288010 - Flags: review?(bugzilla)

Armen [:armenzg]

Updated

•

18 years ago

Attachment #284789 - Attachment is obsolete: true

Armen [:armenzg]

Updated

•

18 years ago

Attachment #285695 - Attachment is obsolete: true

Axel Hecht [:Pike]

Reporter

Comment 19

•

18 years ago

Comment on attachment 289508 [details] [diff] [review] complete patch for l10n-merge Benjamin, could you do a review on this? The original code from dynamis and his intern has a review + comments, where the comments are actually implemented in the follow up patches I checked in on my branch. Comment 18 still has the round up description.

Attachment #289508 - Flags: review?(benjamin)

Axel Hecht [:Pike]

Reporter

Comment 20

•

18 years ago

Comment on attachment 289508 [details] [diff] [review] complete patch for l10n-merge Cancelling review requests, my own tree is looking different now.

Attachment #289508 - Flags: review?(bugzilla)

Attachment #289508 - Flags: review?(benjamin)

Axel Hecht [:Pike]

Reporter

Comment 21

•

17 years ago

Status update, gandalf and I have forked code that could prove useful. More on gandalf's code is on http://wiki.braniecki.net/Projects:mozpyl10n, my code is at http://hg.mozilla.org/users/axel_mozilla.com/tooling/. Some of the differences might have performance impact, so I'm not sure if both are equally well suited to be run as part of the build. Gandalf signed up for making his code feature-compat so that we can have a perf-shoot-out. And I owe him a beer if his code wins. Re-assigning to gandalf for that action item.

Assignee: l10n → gandalf

Status: ASSIGNED → NEW

Axel Hecht [:Pike]

Reporter

Comment 22

•

16 years ago

Marking FIXED, we have at least one 1l0n-merge in production now.

Status: NEW → RESOLVED

Closed: 16 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.