Closed Bug 180508 Opened 22 years ago Closed 18 years ago

Incorrect naming of US locale packages

Categories

(Core :: Internationalization: Localization, defect)

defect
Not set
normal

RESOLVED FIXED

People

(Reporter: dolmen, Assigned: rchen)

References

Details

(Keywords: intl)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; fr-FR; rv:1.1) Gecko/20020826
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; fr-FR; rv:1.1) Gecko/20020826

The standard "english" language package is in a package named "en-US". It should
be named "en".
The standard "english/United States" region package is in a package named "US".
It should be named "en-US".
(I'm not talking about just the Jar file, but the whole
bin/chrome/locale/{US|en-US})

The generic scheme is that the region package should be named as a
combination of language code and region code. The language package must be named
after the language code.


The problem with the current bad naming is that it influences the organisation
of translation projects in a wrong way. Then this creates problems to create
alternative languages for a region package.


Reproducible: Always

Steps to Reproduce:
Hi,
Yes, it should be named en and not en-us; that problem also exists in the
2002111606 build of 1.0.2. Delete that language from the preferences and the
browser should work fine without it if English (US) is your operating system
language.
"Then this creates problems to create
alternative languages for a region package."

I mean alternative combinations of region/language packages.

The point is that region packages contain mostly links to web sites. These web
sites are usually not language-neutral.
In multilingual regions (Switzerland, Canada...) there should be different
region packs for the same region code.

Example: Switzerland
- three lang packs: de, fr, it
- three region packs: de-CH, fr-CH, it-CH

The current naming of US packages has led French people to create the following
packages:
- lang pack : fr-FR (corresponding to en-US)
- region pack : FR (corresponding to US)
where the correct scheme would have been:
- lang pack : fr
- region packs : fr-FR, fr-CA, fr-BE, fr-CH
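
To make the proposed scheme concrete, here is a minimal Python sketch that derives pack names from a language code and a list of region codes. The function name `pack_names` is made up for this illustration; it is not part of Mozilla, and it only encodes the naming rule argued for in this report.

```python
def pack_names(language, regions):
    """Return (language pack name, list of region pack names).

    Follows the scheme proposed in this report: the language pack is named
    after the bare language code, each region pack after language-REGION.
    """
    lang = language.lower()
    region_packs = [f"{lang}-{region.upper()}" for region in regions]
    return lang, region_packs


# French, as in the example above
print(pack_names("fr", ["FR", "CA", "BE", "CH"]))

# Switzerland: one region code, three languages
for code in ("de", "fr", "it"):
    print(pack_names(code, ["CH"]))
```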
> The standard "english" language package is in a
> package named "en-US". It should be named "en".

No, it's the 'US English' language pack, and must be named 'en-US'. 'en-US' is a
perfectly valid RFC 3066 language tag. The 'UK English' language pack should be
named 'en-GB'.

> The standard "english/United States" region package
> is in a package named "US".
> It should be named "en-US".

This, on the other hand, is correct.

I'm confirming this bug, but must insist that the language tag for the US
English (i.e. default) language pack *stays* at 'en-US', while the region code is
changed from 'US' to 'en-US'.
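
As an illustration of why 'en-US' is a well-formed tag: RFC 3066 defines a tag as a primary subtag of 1 to 8 letters followed by optional subtags of 1 to 8 letters or digits, and reads a two-letter second subtag as an ISO 3166 country code. The following is a minimal Python sketch of that rule. The names `TAG_RE` and `split_tag` are invented for this example and are not Mozilla code, and the check is purely syntactic (it does not look codes up in ISO 639 or ISO 3166).

```python
import re

# RFC 3066: Primary-subtag = 1*8ALPHA, Subtag = 1*8(ALPHA / DIGIT)
TAG_RE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def split_tag(tag):
    """Split a language tag into (primary subtag, country code or None)."""
    if not TAG_RE.fullmatch(tag):
        raise ValueError(f"not a well-formed RFC 3066 tag: {tag!r}")
    subtags = tag.split("-")
    primary = subtags[0].lower()
    # A two-letter second subtag is read as an ISO 3166 country code.
    country = None
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        country = subtags[1].upper()
    return primary, country


for tag in ("en-US", "en-GB", "en"):
    print(tag, split_tag(tag))
```

Under this reading, 'en-US' and 'en-GB' both parse as language 'en' plus a country code, while a bare 'US' would parse as a primary (language) subtag, which is part of why 'US' is a confusing package name.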
Status: UNCONFIRMED → NEW
Ever confirmed: true
Why is the US in accept language written in lower case, whereas it's in upper
case in the user agent ?

User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.2b) Gecko/20021016
X-Accept-Language: it, en, en-us
Here is an interesting page about "Issues and Advantages of the use of Locales
in Software" :

http://www.i18nguy.com/locales/
Bryan, Mathias: this bug is not about the language content negotiation.

This bug is about l10n of Mozilla (you know, translation of Mozilla itself).
May I suggest naming the language packs lang-en-US, lang-en-GB, lang-fr and the
region packs region-en-US, region-en-GB, region-fr-FR, region-fr-CH, ... ?

It would be easier for everyone and avoid further confusion.
I think both language code and regional code are still imperfect. I translate
into Sorbian. There is only one code for 2 Sorbian languages (wen for Upper
Sorbian and Lower Sorbian). At present I translate into Upper Sorbian only. But
it isn't out of question that there will be a Lower Sorbian language pack in
future. How would they be distinguished? Besides, the regional code is DE since
both Sorbian languages are spoken in Germany. But if anybody has installed
DE.jar, that belongs to a German, not to a Sorbian, locale. His DE.jar could be
overwritten although he doesn't want that. Well, I would need: lang-usb-DE
and lang-lsb-DE and reg-usb-DE and reg-lsb-DE. Another possibility would be to
use names of real regions. At present the region code is a country code. I would
need a code for Upper Lusatia and Lower Lusatia, the 2 regions where the Sorbian
languages are spoken. But, a disadvantage would be that the number of codes
would multiply.
*** Bug 102509 has been marked as a duplicate of this bug. ***
We need to keep zh-TW and zh-CN in this discussion too. They really complicate
things.

zh-TW and zh-CN are two different languages, so the language packs have to be
four character identifiers. zh is NOT a language (like en is).
The names of the main language packs basically seem to be OK, based on numerous
discussions, also in here. The problems are the region packages.

IMHO, we can go two ways from here:
1) reintegrate region packs with language packs and put them into the same .jar
file, and use the same language code for both.
2) rename region packs to have better names, best prefixed with reg[ion]-* or
cont[ent]-*

I'd really prefer to go by 1), as I'm not convinced that we need them separated.
The Netscape idea that the whole region/content pack concept was based upon
never took off (as I discussed in my FOSDEM slides this year). It's basically a
good concept to keep the resources separated (URLs in different internal files
than normal strings), but I'm not convinced we have to complicate files and UI
for that.
(In reply to comment #12)
> IMHO, we can go two ways from here:
> 1) reintegrate region packs with language packs and put them into the same .jar
> file, and use the same language code for both.
> 2) rename region packs to have better names, best prefixed with reg[ion]-* or
> cont[ent]-*

Robert, 
I do not think that joining language and regional packs is a good idea. 
Theoretically, there should be much more regional packs, while the number of
language packs is limited to the number of languages. We should not mix this,
please. Some localisers just want to translate the application, while they do
not care about the regional pack --- they should not feel that they left
something of their language pack; instead, some other team may come up with the
regional pack... 

Since there is no problem with installing multiple xpi's from the same link, we should
not join these packs. I think it will be better to go by 2). 
Keywords: intl
I don't care what we theoretically have... I care for what reality says.
And in reality,
1) not using a localized region pack causes incomplete localization,
2) I know only of cases where we have one region pack for one localization,
3) there's nothing really region (country, ...) specific in region/content packs,
3) localizers almost anywhere are delivering region packs with the localizations,
4) Your argument of comment #13 would be the same for the currently discussed
issue of platform packs, and
5) there's no really good argument to have separated packs for language and
region/content other than what Netscape originally thought. And that idea (of
having companies deliver their customized "content" in a separated pack) never
really took off.

On the other hand, it would simplify quite a few things to just make region
packs go away, it might even make it easier to fix some bugs.

That's why I'm more for merging the packs.
Either way we go, it would be better than now though.
(In reply to comment #13)
> I do not think that joining language and regional packs is a good idea. 
> Theoretically, there should be much more regional packs, while the number of
> language packs is limited to the number of languages. We should not mix this,
> please. Some localisers just want to translate the application, while they do
> not care about the regional pack --- they should not feel that they left
> something of their language pack; instead, some other team may come up with the
> regional pack... 
I think you are miscounting the number of languages and regions in the world.
There are around 7000 languages in the world, and far fewer countries.
That is, of course, assuming that regional packs would be based on country-level
regions, but I can't imagine anyone wanting to do a smaller region.
In South Africa we have 11 official languages and localizations of Mozilla into
6 of those. It's hard to define what a "regional" pack would contain, that would
be applicable to all the languages.
So I support merging the region packs into the language packs.
The question is, can/should this be done for the default US region pack?
That would lead the way for other localizations...
(In reply to comment #15)
> I think you are miscounting the number of languages and regions in the world.
> There are around 7000 languages in the world, and far fewer countries.

David, some languages are spoken in quite a few regions, although the language
itself is the same, regional content may differ. I.e., it will be just perfectly
fine to have ru-US, es-US, fr-US regional packs, while it makes no sense for
such language packs, because those mentioned languages are not regulated on the
US territory, nevertheless, there are some areas in the US, where people only
speak Russian.

If no one yet creates such regional packs, does not mean that the idea is bad. 

David, will you have any objections about renaming US to en-US in the regional
pack? Have you had any problems with renaming it to include both language and
regional code for your language?
Constantine:
Discussing it that way, we should probably be clear about what "regional"
content there is in those packs. Well, it's some default URLs, basically, and
some things that a distributor might want to change. The "content pack"
description says more what it's actually about. "region" is more or less a
misnomer, IMHO.
I guess NSCP ppl chose that name so that they could smuggle that idea into
open-sourced Mozilla more easily, but actually it doesn't describe what's in
there right now. It describes an idea that took off even less than the content
pack idea, which wasn't followed a lot itself either.

The whole thing was an interesting idea basically, but the benefit in practice
is much smaller than the cost.
And users don't really know why they should switch two things in the UI, when
they just want to change the language.

IMO, we should change the internal code from US to en-US and integrate the
contents into en-US.jar, following that we can make the pref panel more like the
themes panel, so that it's more easy for users...
(In reply to comment #14)
> I don't care what we theoretically have... I care for what reality says.
> And in reality,
> 1) not using a localized region pack causes incomplete localization,
> 2) I know only of cases where we have one region pack for one localization,

I know of cases where we do not have a regional pack for a localisation. This
change will force those localisers to translate it, if they still want to be
consistent with en-US. I don't think it will be easy on those localisers, so I'd
rather not see the merging step.

> 3) there's nothing really region (country, ...) specific in region/content packs,

What about bookmarks? They sure are specific to the region. 

> 3) localizers almost anywhere are delivering region packs with the localizations,

No, about 1/3 of localisations, as of Mozilla 1.6, do not contain regional pack
in the default language xpi. 

> 4) Your argument of comment #13 would be the same for the currently discussed
> issue of platform packs, and

I did not get this point...

> 5) there's no really good argument to have separated packs for language and
> region/content other than what Netscape originally thought. And that idea (of
> having companies deliver their customized "content" in a separated pack) never
> really took off.

I think, this is because of the bad naming... They named it after the region,
which is wrong. If we fix it, there might be some volunteers to create regional
packs. 

Merging step will also violate the idea of modularisation. See bug #230596 and
bug #234261. We should go step 2 (renaming). 
(In reply to comment #18)
> I know of cases where we do not have a regional pack for a localisation. This
> change will force those localisers to translate it, if they still want to be
> consistent with en-US. I don't think it will be easy on those localisers, so I'd
> rather not see the merging step.

They can just take the en-US contents and include them (think "keep original").

> What about bookmarks? They sure are specific to the region. 

Well, they aren't contained in US.jar, and that's the main thing I'm talking about.

> No, about 1/3 of localisations, as of Mozilla 1.6, do not contain regional pack
> in the default language xpi. 

Most of that 1/3 have a separate regxx.xpi though. The contents of US.jar are
easy to include; if they don't want to change them, they can just select "keep
original" in MT or whatever tool they use.

> > 4) Your argument of comment #13 would be the same for the currently discussed
> > issue of platform packs, and
> 
> I did not get this point...

You said "instead, some other team may come up with the regional pack" -
currently, it's basically the same with platform packs. Those are already a bit
more integrated though, so it's easier to manage currently than the region pack
that is treated in a very special and complicated way (a separate locale that
actually isn't a locale though).

> I think, this is because of the bad naming... They named it after the region,
> which is wrong. If we fix it, there might be some volunteers to create regional
> packs. 

I've been around here too long to believe that.

> Merging step will also violate the idea of modularisation. See bug #230596 and
> bug #234261. We should go step 2 (renaming). 

In both of those bugs, I'm not really interested in making that happen. Everyone
can do that with the current infrastructure if (s)he wants.
Modularisation is good as long as it has a purpose. We have quite high
modularisation on the chrome registry level, we have lots of different components
there, which I believe to be a good thing. We always can provide localization
for only some part of those components (represented as different directories in
the .jar files), and Mozilla will fall back to the default (normally en-US/US)
for the others. I've got no problem with keeping the separated *-region
components in chrome registry, if name and file contents get merged.
I just think we need to go away with having something that looks like a solution
where there's technically no real problem behind it, and create more problems
with the solution than we really could solve.

I wasn't for that solution when NSCP ppl came up with the proposal, and the only
reason they could tell me that really was for that way was marketing. And you
couldn't *ever* discuss NSCP marketing. They often had plain wrong ideas, and
they didn't ever want to discuss them. That was one reason why Seamonkey UI
couldn't really move, and one main reason that UI development more or less died
(besides mpt, but I don't want to discuss that here).
Anyways, I remember their move to make my localization work much more
complicated with that region pack idea, and I never saw a real benefit.
As I said, that someone doesn't want to localize it is no real argument,
because if you leave out localization of inspector, venkman, composer(editor),
we're already good enough that we fall back to our default locale that (hopefully)
is installed. So if somebody wants to leave *-region untouched, they can, even
if they're in the en-US.jar.

BTW, if we need less files to be loaded at startup, we might even see faster
startup...
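
The fallback behaviour described above (a locale that only covers some components, with the default locale serving the rest) can be sketched roughly as follows in Python. This is only an illustration of the idea, not the chrome registry's actual implementation: the `REGISTRY` table, its contents, and `resolve_locale` are hypothetical names chosen for the example.

```python
DEFAULT_LOCALE = "en-US"

# Hypothetical registry: locale code -> set of components it provides.
REGISTRY = {
    "en-US": {"navigator", "messenger", "editor", "inspector", "venkman",
              "communicator-region"},
    "de-AT": {"navigator", "messenger", "communicator-region"},
}


def resolve_locale(component, selected, registry=REGISTRY):
    """Return the locale that will serve `component` for the selected locale."""
    if component in registry.get(selected, set()):
        return selected
    # Component not localized in the selected locale: use the default one.
    if component in registry.get(DEFAULT_LOCALE, set()):
        return DEFAULT_LOCALE
    raise LookupError(f"no locale provides {component!r}")


for component in ("navigator", "venkman", "communicator-region"):
    print(component, "->", resolve_locale(component, "de-AT"))
```

With a single locale code per pack, a *-region component is resolved exactly like any other component, so a localizer who leaves it out simply gets the default contents.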
kairo@kairo.at writes in comment #14 :

> 3) there's nothing really region (country, ...)
> specific in region/content packs,

This particular item, IMVHO, is not true.

See the sensible info the Form Manager collects
communicator-region/wallet/*.xul
 (although I strongly doubt about the usefulness
 of the current Form Manager version for MAS,
 but that's another story / bug :)

The address book "find an address" service (map):
messenger-region/region.properties

Now that Mozilla Foundation Europe is trying to take
off, wouldn't it be a nice idea to deliver different
contents packages?

> On the other hand, it would simplify quite a few
> things to just make region packs go away, it might
> even make it easier to fix some bugs.

You're talking about the nasty preferences page
for the selection of the active l/c pack, or there's
something else you're particularly thinking about?

(if it's not clear enough, I'd rather go for kairo
proposal 2)

in comment #17
> The whole thing was an interesting idea basically,
> but the benefit in practice is much smaller than
> the cost. And users don't really know why they
> should switch two things in the UI, when they
> just want to change the language.

Completely agree. But the problem here is we should
have a way to automatically activate an installed
package (which the user has most likely installed
in order to use it).

> IMO, we should change the internal code from US to
> en-US and integrate the contents into en-US.jar,
> following that we can make the pref panel more
> like the themes panel, so that it's more easy for
> users...

Or have the items listed in the preferences, bind to
a pair of contents and language resources, each.
In this way we could have something more approachable
for end user, while (partially) maintaining the
modularity in the back-end.


And, yes, to get out something useful from the thing
(contents resources split from the main localizable
bundle) would require people pushing the idea (mostly
to who could be interested in creating custom contents).
I'm not completely sure, but PTC might have done
this kind of work embedding Gecko in their products.
Who knows, maybe even IBM..
My point here as contributor of es-ES localization, is that we do the work for
lang and regional pack (as well as the whole rest of teams, I think), and
distribute them separately *only* for being consistent with the en-US structure,
but we'd like all to be integrated into one file, because AFAIK, nobody has ever
used the regional file "standalone", but always as a whole together with the
langpack.

Apart, I'd like to extend the issue to _ALL_ translatable packages included in
_ANY_ package. For instance, chatzilla lang pack comes as chatzilla.jar instead
of chatzillaenus.jar or similar. Since we also provide chatzilla translations,
we have been forced to name the file chatzillaeses.jar in order not to break the
installation, and the same applies for inspector, venkman, and the whole bunch
of software being developed at mozdev.org.
(In reply to comment #20)
> See the sensible info the Form Manager collects
> communicator-region/wallet/*.xul
>  (although I strongly doubt about the usefulness
>  of the current Form Manager version for MAS,
>  but that's another story / bug :)

Well, I consider wallet not to be working too well, and there's even a
long-standing bug about L12y in wallet, IIRC.
Additionally, I don't feel comfortable about XUL in a locale package anyways
(and I never have touched any of those files for L10n).

> The address book "find an address" service (map):
> messenger-region/region.properties

Wow... One single thing that's dependent on a real region? Probably we could
argue if this should be configurable for the user himself ;-)

> Now that Mozilla Foundation Europe is trying to take
> off, wouldn't it be a nice idea to deliver different
> contents packages?

I consider that a dream, I don't think we will see multiple content packs per
Language as a common case any time soon.

> You're talking about the nasty preferences page
> for the selection of the active l/c pack, or there's
> something else you're particularly thinking about?

I'm talking about creating the packs and maintaining them. I don't think it's
worth doing two different files with two structures and registering them
differently with different things to note in the contents.rdf; it's too much
work for something that I don't see much worth in.
Having a single locale code, packing all the *-region directories in the same
.jar, and leave the modularity at chrome registry level would be easier to
maintain, IMHO.

> Or have the items listed in the preferences, bind to
> a pair of contents and language resources, each.
> In this way we could have something more approachable
> for end user, while (partially) maintaining the
> modularity in the back-end.

Well, that's quite similar to what I'm thinking of. If content and language pack
have the same locale code (and then it's a small and easy move to just pack them
into the same file), the UI can activate both at once and the user can switch
them with one single selection.
They could still install an additional pack with contents for a different region
and activate its contents (the chrome registry has no problem with that, we
should look into how to make it possible in UI so that it looks smooth. We have
similar problems already with what I'd call "partially activated locale packs"
but that's a different discussion basically.)

> And, yes, to get out something useful from the thing
> (contents resources split from the main localizable
> bundle) would require people pushing the idea (mostly
> to who could be interested in creating custom contents).

True. And I never saw anybody outside Netscape who really was fond of the idea
and would have been likely to push it. As even Netscape didn't push it, I'd
doubt to see anyone else doing it.

(In reply to comment #21)
> Since we also provide chatzilla translations,
> we have been forced to name the file chatzillaeses.jar in order not to break the
> installation, and the same applies for inspector, venkman, and the whole bunch
> of software being developed at mozdev.org.

I don't understand that issue. I'm packing locale resources for chatzilla,
venkman, inspector, and calendar into the same de-AT.jar as the translation of
en-US.jar - it's no problem to do that, it just needs some manual/script
fiddling with your .jar file. You can have the scripts I'm using if you like,
they're available on a public server already. Just contact me via IRC or E-Mail.
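
For readers who want to try this kind of packing themselves, here is a rough Python sketch of merging several locale .jar files into one, relying only on the fact that .jar files are zip archives. It is not the set of scripts mentioned above; `merge_jars` is a hypothetical helper, and it covers only the packaging step, not any registration in contents.rdf.

```python
import zipfile


def merge_jars(target_path, source_paths):
    """Copy every entry of the source jars into one target jar.

    Jars are zip archives. If the same entry name appears in more than one
    source, the first occurrence wins and later ones are skipped.
    """
    seen = set()
    with zipfile.ZipFile(target_path, "w", zipfile.ZIP_DEFLATED) as target:
        for source_path in source_paths:
            with zipfile.ZipFile(source_path) as source:
                for info in source.infolist():
                    if info.filename in seen:
                        continue
                    seen.add(info.filename)
                    # Reuse the original entry metadata for the copy.
                    target.writestr(info, source.read(info.filename))
    return sorted(seen)
```

A call such as `merge_jars("de-AT.jar", ["de-AT-core.jar", "chatzilla-de-AT.jar"])` (file names invented for the example) would produce one archive containing the directories of both inputs.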
In bug 102509, I have referenced a pack which does include region and platform
pack in the language pack. If you don't already know it, you might want to check
it out...
Depends on: 325473
This has been fixed by bug 325473 which merged content packs into en-US.jar - content packs are history on trunk now.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED