Closed
Bug 1443175
Opened 7 years ago
Closed 7 years ago
Clean up localization repositories: obsolete files, comments-only files
Categories
(Mozilla Localizations :: Other, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: flod, Assigned: flod)
Details
Attachments
(1 file)
I'd like to run a script to clean up years of accumulated mess in the existing localization repositories.
Reason:
* Pootle, unlike Pontoon, used to commit files with only comments to the repository.
* Some locales enabled projects like Thunderbird or SeaMonkey years ago, and files have been lingering in the repository for years.
Strategy:
* Only look at localizable files: .dtd, .properties, .ini, .inc, .ftl
* Remove obsolete files, i.e. localizable files that are not available in gecko-strings
* Parse all localizable files, and remove files that don't include any parsable string
The reason to not look at all files is that we have legitimate files around (e.g. dictionaries).
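The strategy above could be sketched roughly as follows. This is not the attached clean_hg_repository.py: the function names are hypothetical, and `has_strings` is a crude regex heuristic standing in for a real parser of each format. The heuristic errs toward keeping files (for example, an entity inside a DTD comment still counts as a string).

```python
import re
from pathlib import Path

# Extensions listed in the strategy; everything else (e.g. dictionaries) is ignored.
LOCALIZABLE_EXTENSIONS = {".dtd", ".properties", ".ini", ".inc", ".ftl"}


def localizable_files(root):
    """Return relative POSIX paths of localizable files under root, skipping .hg."""
    root = Path(root)
    found = set()
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if ".hg" in rel.parts:
            continue
        if path.is_file() and path.suffix in LOCALIZABLE_EXTENSIONS:
            found.add(rel.as_posix())
    return found


def has_strings(path):
    """Crude stand-in for real parsing: does the file define at least one string?"""
    path = Path(path)
    text = path.read_text(encoding="utf-8", errors="replace")
    if path.suffix == ".dtd":
        return re.search(r"<!ENTITY\s+[\w.\-]+\s+[\"']", text) is not None
    if path.suffix == ".inc":
        return re.search(r"^#define\s+\S+\s+\S", text, re.MULTILINE) is not None
    if path.suffix == ".ftl":
        return re.search(r"^-?[A-Za-z][\w-]*\s*=", text, re.MULTILINE) is not None
    # .properties and .ini: any non-comment, non-section line with a key=value pair
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(("#", "!", ";", "[")) and "=" in stripped:
            return True
    return False


def files_to_remove(locale_root, reference_root):
    """Split a locale's localizable files into obsolete ones and comments-only ones."""
    reference = localizable_files(reference_root)
    obsolete, empty = [], []
    for rel in sorted(localizable_files(locale_root)):
        if rel not in reference:
            obsolete.append(rel)  # not available in the reference checkout
        elif not has_strings(Path(locale_root) / rel):
            empty.append(rel)  # exists in the reference, but has no parsable string
    return obsolete, empty
```

Here `reference_root` would be a local clone of gecko-strings and `locale_root` a clone of one l10n-central repository. The function only reports candidates; actually removing and committing them is left out.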
I've been running this script on a few locales without noticing any particular issues, but I'd like a sanity check before running it on all repositories.
Example: https://hg.mozilla.org/l10n-central/af/rev/c29ec316124a
Assignee | ||
Comment 1•7 years ago
Attachment #8956111 -
Flags: feedback?(l10n)
Updated•7 years ago
Attachment #8956111 -
Attachment mime type: text/x-python-script → text/plain
Comment 2•7 years ago
Comment on attachment 8956111 [details]
clean_hg_repository.py
lgtm.
Attachment #8956111 -
Flags: feedback?(l10n) → feedback+
Assignee | ||
Comment 3•7 years ago
Thanks, I've run the script on all repositories, spot-checked some of them, and couldn't find anything wrong.
I will likely run this script from time to time.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Comment 4•6 years ago
Flod, just stumbled over this when checking the suite l10n de directory. Could you please back out all changes made to the l10n suite directories? We still need the removed localizations in suite for SeaMonkey 2.57. We are late with this one and need to take the central ones.
Flags: needinfo?(francesco.lodolo)
Assignee | ||
Comment 5•6 years ago
(In reply to Frank-Rainer Grahl (:frg) from comment #4)
> Flod just stumbled over then when checking the suite l10n de directory.
> Could you please back out all changes made to the l10n suite directories. We
> still need the removed localizations in suite for SeaMonkey 2.57. We are
> late with this one and need to take the central ones.
I can't back out changes only for one folder, and this has been done for almost 9 months by now (4 times).
SeaMonkey is localized using the cross-channel repository as a base, and that covers central, beta, and release. If these files have been removed, they are also not available for localization in tools like Pontoon. The same is true for shared files, which I assume SeaMonkey needs, so I'm not exactly sure how that would work.
I'm sorry, but I don't see how I can help. If really needed, I can exclude /suite files from future cleanups.
Flags: needinfo?(francesco.lodolo)
Comment 6•6 years ago
We had a solution for the shared files until now. Please exclude suite from further cleanups.
Flags: needinfo?(francesco.lodolo)
Assignee | ||
Comment 7•6 years ago
OK, already updated the script to exclude /suite files.
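A minimal sketch of what such an exclusion could look like, assuming the script works on paths relative to the repository root. The names `EXCLUDED_TOP_LEVEL`, `is_excluded`, and `filter_removals` are hypothetical and not taken from the actual script.

```python
from pathlib import PurePosixPath

# Hypothetical setting: top-level folders that a cleanup run should never touch.
EXCLUDED_TOP_LEVEL = {"suite"}


def is_excluded(rel_path, excluded=EXCLUDED_TOP_LEVEL):
    """True if the repository-relative path lives under an excluded top-level folder."""
    parts = PurePosixPath(rel_path).parts
    return bool(parts) and parts[0] in excluded


def filter_removals(candidates, excluded=EXCLUDED_TOP_LEVEL):
    """Drop removal candidates that fall under an excluded folder."""
    return [path for path in candidates if not is_excluded(path, excluded)]
```

Only the first path component is checked, so a folder that merely happens to be named suite deeper in the tree would still be cleaned up.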
Flags: needinfo?(francesco.lodolo)
Assignee | ||
Comment 8•6 years ago
(In reply to Francesco Lodolo [:flod] from comment #7)
> OK, already updated the script to exclude /suite files.
Side note: this won't prevent localizers from removing them on their own, especially if they work on hg directly, and see these files reported as obsolete.
Comment 9•6 years ago
> and see these files reported as obsolete.
Yes, but at least for de, files still in use in the current tree were removed, e.g. editBookmarkOverlay.dtd. The script is clearly broken.
Comment 10•6 years ago
How does the script detect if files are in use? If we take:
https://hg.mozilla.org/l10n-central/en-GB/rev/d1009e2538b4
This removes both brand.properties and brand.dtd which both seem to be very much in use:
https://dxr.mozilla.org/comm-central/search?q=brand.properties+path%3Asuite&redirect=false
https://dxr.mozilla.org/comm-central/search?q=brand.dtd+path%3Asuite&redirect=false
or am I missing something?
Assignee | ||
Comment 11•6 years ago
It uses the cross-channel repository as a reference. If the file is missing there, it's considered obsolete and removed.
https://hg.mozilla.org/l10n/gecko-strings/
Please look at the paths before assuming that the script is broken.
(In reply to Frank-Rainer Grahl (:frg) from comment #9)
> > and see these files reported as obsolete.
>
> Yes but at least for de files still in use in the current tree were removed.
> eg. editBookmarkOverlay.dtd. The script is clearly broken.
comm-central has comm/suite/locales/en-US/chrome/common/places/editBookmarkOverlay.dtd which maps to suite/chrome/common/places/editBookmarkOverlay.dtd in l10n repositories.
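To illustrate that mapping, here is a rough sketch based only on the single example above: it drops a leading comm/ and the locales/en-US segment. The function name `l10n_path` is hypothetical, and the rule is an assumption generalized from this one path, not the actual logic used to build gecko-strings.

```python
from pathlib import PurePosixPath


def l10n_path(source_path):
    """Map an en-US source path to its assumed location in an l10n repository."""
    parts = list(PurePosixPath(source_path).parts)
    # Assumption: comm-central files carry a leading "comm" folder that l10n repos lack.
    if parts and parts[0] == "comm":
        parts = parts[1:]
    # Assumption: the first "locales/en-US" pair is dropped from the path.
    for i in range(len(parts) - 1):
        if parts[i] == "locales" and parts[i + 1] == "en-US":
            parts = parts[:i] + parts[i + 2:]
            break
    return "/".join(parts)


print(l10n_path("comm/suite/locales/en-US/chrome/common/places/editBookmarkOverlay.dtd"))
# prints: suite/chrome/common/places/editBookmarkOverlay.dtd
```

Comparing the mapped path against the one a locale actually ships makes the difference between a moved file and a missing file visible.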
For French
https://hg.mozilla.org/l10n-central/fr/file/tip/suite/chrome/common/places/editBookmarkOverlay.dtd
That's a brand new file in a different path. Even if the content of the file was the same, nobody actually bothered to go through l10n repos and copy the file over to the new path to avoid localizers seeing it as a new file (and translating it from scratch).
For German, this file was removed: suite/chrome/common/bookmarks/editBookmarkOverlay.dtd (note /bookmarks here, versus the /places subfolder in the current path)
In this changeset: https://hg.mozilla.org/l10n-central/de/rev/440235f6e077
It's not available in German because nobody is working on that localization, as far as I can tell from the history of the repository.
The same goes for branding
https://hg.mozilla.org/comm-central/rev/0d275d681a5d
Comment 12•6 years ago
(In reply to Francesco Lodolo [:flod] from comment #11)
> It uses the cross-channel repository as a reference. If the file is missing
> there, it's obsolete and removed
> https://hg.mozilla.org/l10n/gecko-strings/
I presume that repo was populated from somewhere?
When an en-US localisable file is worked on, does it automagically make changes to the gecko-strings repo, or does something extra have to be done?
Assignee | ||
Comment 13•6 years ago
That repo includes all strings that exist in mozilla/comm central/beta/release. For more details see the old announcement post
https://groups.google.com/d/msg/mozilla.dev.l10n/_K2j7Sg0Orw/5A4GzHAKAAAJ
Generation is done by running scripts manually (for now), typically once or twice a week. You can get an idea from the pushlog.
Comment 14•6 years ago
I see that for TB it has picked up mail/branding/thunderbird/locales/en-US
but not the equivalent for SM of suite/branding/seamonkey/locales/en-US
Is that from the l10n.ini file in suite/locales ?
Comment 15•6 years ago
(In reply to Frank-Rainer Grahl (:frg) from comment #4)
> Flod just stumbled over then when checking the suite l10n de directory.
> Could you please back out all changes made to the l10n suite directories. We
> still need the removed localizations in suite for SeaMonkey 2.57. We are
> late with this one and need to take the central ones.
2.57 is from https://hg.mozilla.org/releases/comm-esr60/file/tip/suite/config/version.txt ?
ESR isn't included in cross-channel so far. We intentionally did not do that for 60 because of the Fluent migration work.
I suggest using l10n-central from the Firefox 60 revisions, which should also include the SeaMonkey work at the time. https://product-details.mozilla.org/1.0/l10n/Firefox-60.0-build2.json has a list that should be good to use; pick your subset from that?
Comment 16•6 years ago
Yes, Thunderbird has its branding directories included in l10n.toml, which SeaMonkey doesn't. So the branding is not exposed to localization or the l10n build system.
Assignee | ||
Comment 17•6 years ago
(In reply to Francesco Lodolo [:flod] from comment #7)
> OK, already updated the script to exclude /suite files.
Just to make sure we're all on the same page: I won't be removing those files, but they will still be invisible to localizers via Pontoon, because en-US doesn't have them. They won't be able to see missing strings, fix errors, and so on.
That's hardly a scenario that makes sense, given that most locales don't have access to Mercurial these days.
Comment 18•6 years ago
Flod, sorry, I screwed up. The script was correct. I forgot about two changes done to comm-central. But please leave suite out of the cleanup.