Closed
Bug 178491
Opened 22 years ago
Closed 18 years ago
Bring languageNames.properties up to date with IANA registry
Categories
(Core :: Internationalization, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: smontagu, Assigned: smontagu)
Details
(Keywords: intl)
Attachments
(2 files, 2 obsolete files)
3.39 KB, patch | smontagu: review+ | alecf: superreview+
1.30 KB, patch | jshin1987: review+
This is split off from bug 167908.
Assignee | ||
Comment 1•22 years ago
Patch by Malcolm Rowe (bugzilla2@farside.demon.co.uk). Comments copied from bug 167908 comment 2:

This patch updates languageNames.properties to be consistent with the latest updates to ISO 639-2, and also places the file back into language-code order. To enable Frisian in the dialog, we will also have to add an entry to intl/locale/src/language.properties, once we know what country/countries it should be placed in.

The new file is from ISO 639-2, taken from:
1. http://www.loc.gov/standards/iso639-2/langcodes.html, plus
2. The addition of the two extra codes (ast, x-kok) which were already in place, and the renaming of "Greek, Modern" to "Greek" (as was already done), plus
3. The re-addition of the following deprecated codes (see http://www.loc.gov/standards/iso639-2/codechanges.html):
   in = Indonesian (deprecated 1989 in favour of id)
   ji = Yiddish (deprecated 1989 in favour of yi)
   sh = Serbo-Croatian (deprecated 2000)

Changes from our current languageNames.properties:
0. In language-code order.
1. Many spelling/name changes, notably the following names:
   Bhutani -> Dzongkha
   Farsi -> Persian
   Scots Gaelic -> Gaelic
   Cambodian -> Khmer
   Greenlandic -> Kalaallisut
2. Addition of many codes, including Frisian.
3. Javanese changed from jw to jv - a known erratum in ISO 639:1988.
4. Removal of:
   sb = Sorbian
   sx = Sutu
   I can find no reference to sb or sx ever being valid ISO 639 codes.
Comment 2•22 years ago
> sb = Sorbian
> sx = Sutu

These two are used in Microsoft products: http://msdn.microsoft.com/workshop/author/dhtml/reference/language_codes.asp

It seems that at the time when MS adopted these, there were neither ISO-639-1 abbreviations nor ISO-639-2 ones for them. Sorbian (Upper & Lower) now has a 3-letter code (wen). Sutu still does not have any representation in ISO-639-1/2. Sutu is a variant name for one of the Southern Sotho languages: http://www.ethnologue.com/show_language.asp?code=SSO

Apparently MS thought it important to use this 2-letter abbreviation for Sutu until one is established. There are a few precedents of Netscape doing something similar before.
Comment 3•22 years ago
Question: Is this a proposal to add to the current visible list (through the UI dialog) **all** the ISO-639-1/2 languages? Or are we completing the list but only turning on the flag for the ones which are needed? We have been taking the latter approach up to now because showing every language would make the list too long.
Assignee | ||
Comment 4•22 years ago
I agree that we should continue with the latter approach. The patch does not include any new three letter language codes, but includes all two letter codes not in the current list. Adding all the three letter codes with no two letter equivalent would make the list much longer, and I suggest we continue only adding them when someone specifically requests them.
Assignee | ||
Comment 5•22 years ago
My comments on the patch:

There is now one more new two letter code in http://www.loc.gov/standards/iso639-2/codechanges.html:
ii Sichuan Yi

The name for "ho" should be Hiri Motu.

Why do we want to retain the deprecated codes "sh", "ji" and "in"?

Konkani has a standard code "kok", which we should probably use instead of "x-kok".

We should try to investigate whether the name changes are acceptable in the field. As Malcolm points out in the original bug, we have already rejected the change from "Galician" to "Gallegan", see bug 127946 comment 7.
Comment 6•22 years ago
We can add back:
wen = Sorbian
if we are willing to start adding 3-letter codes from ISO-639-2. Currently I don't see any use of 3-letter codes, but our code for handling accept-language headers is designed to take these as well, so it should present no problem in that regard.

As for Sutu, we can split the original reference into:
nso = Sotho, Northern
st = Sotho, Southern
each having more than 3.5 million speakers in South Africa.
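To illustrate the point about accept-language handling, here is a minimal sketch of a parser that accepts both 2-letter and 3-letter primary language codes. It is not Mozilla's actual code: the function name `parse_accept_language` and the tag pattern are assumptions made for illustration only.

```python
import re

# Hypothetical pattern: a 2- or 3-letter primary code, optionally followed
# by subtags such as "-US". Private-use tags like "x-kok" are not matched.
_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*$")


def parse_accept_language(header):
    """Return the language tags in an Accept-Language header, highest q first."""
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not _TAG.match(tag):
            continue  # skip wildcards and malformed tags in this sketch
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        # Sort by descending q, then by original position in the header.
        entries.append((-q, position, tag.lower()))
    return [tag for _, _, tag in sorted(entries)]
```

With this sketch, 3-letter codes such as wen and nso are treated exactly like 2-letter ones; only the primary-code length check differs from a 2-letter-only parser.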
Assignee | ||
Comment 7•22 years ago
I have posted in the netscape.public.mozilla.i18n newsgroup requesting feedback on the name changes. I have already received a comment offlist that Punjabi is correct, not Panjabi. The names in the standard derive from the Library of Congress Subject Headings, and we should not expect our needs and priorities to be identical with those of the Library of Congress.
Summary: Bring language.properties up to date with iso639-2 → Bring languageNames.properties up to date with iso639-2
Comment 8•22 years ago
With regard to comment #7, very often English names for languages have variants. Both Panjabi and Punjabi are known variant names for the same language. In our list, we normally list only one name, and that should be the "preferred" name. The United Nations, for example, recognizes both names: http://www.unhchr.ch/udhr/navigate/alpha.htm#P (Was the person who wrote to smontagu a Pakistani or Indian? That could also be a factor in the preferred name.) I don't mind not changing this name to Panjabi given this state of affairs -- it was there in the code before and we should not change it unless there is a compelling reason to.

BTW, I don't believe ISO-639-1/2 lang names are based on LofC names. They are based on submissions from requesters, with reasons provided for preferring one name over others if variants are submitted.
Status: NEW → ASSIGNED
Assignee | ||
Comment 9•22 years ago
My correspondent was from India, and "Punjabi" does seem to be the transliteration used by Punjabis. The official government sites of the Punjab in India and Pakistan are http://www.punjab.gov.in/ and http://www.punjab.gov.pk/
Assignee | ||
Comment 10•21 years ago
*** Bug 209591 has been marked as a duplicate of this bug. ***
Comment 11•21 years ago
Ok, here's another attempt at a patch. This is pretty much the same idea as described in comment 1, except that where we had an existing description, I've kept it (with two exceptions, see below).

The changes from our current version are as follows:
1. It's sorted in code order (this just makes it easier to compare with the official list).
2. I've added all the missing codes.
3. Removals:
   in (Indonesian): deprecated 1989 in favour of id
   ji (Yiddish): deprecated 1989 in favour of yi
   sh (Serbo-Croatian): deprecated 2000
4. Code changes:
   jw (Javanese) to jv: known erratum in ISO 639:1988
   sb (Sorbian) to wen: non-standard, now using correct ISO 639-2 code.
   x-kok (Konkani) to kok: x-code, now using correct ISO 639-2 code.
5. Sotho/Sutu:
   From: st (Sesotho) and sx [not standard] (Sutu)
   To: st (Sotho, Southern), and nso (Sotho, Northern).
   (see comment 2, comment 6).
6. Description change: vo (Volapuk) changed to vo (Volap\u00fck)
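The removals, code changes and re-sorting listed in this comment could be sketched roughly as follows. This is only an illustration, not the attached patch: it assumes one "code = Name" entry per line, and `update_entries` is a hypothetical helper name. The Sotho/Sutu split and the Volapük description change are not covered.

```python
# Renames and removals as listed in the comment above. The "code = Name"
# line format is an assumption about languageNames.properties.
RENAMES = {"jw": "jv", "sb": "wen", "x-kok": "kok"}
REMOVALS = {"in", "ji", "sh"}


def update_entries(lines):
    """Drop deprecated codes, rename non-standard ones, and sort by code."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        code, _, name = line.partition("=")
        code = code.strip()
        if code in REMOVALS:
            continue
        entries[RENAMES.get(code, code)] = name.strip()
    return [f"{code} = {name}" for code, name in sorted(entries.items())]
```

Keeping the output sorted by code is what makes a later line-by-line comparison with the official list practical.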
Attachment #105202 -
Attachment is obsolete: true
Updated•21 years ago
Attachment #126544 -
Flags: review?(smontagu)
Assignee | ||
Comment 12•21 years ago
Comment on attachment 126544: Update languageNames.properties (non-controversial changes only)
Please add back the name change from Farsi to Persian (bug 204767 comment 2). With that, r=smontagu.
Comment 13•21 years ago
Same as attachment 126544, but includes the name change from Farsi to Persian.
Updated•21 years ago
Attachment #126544 -
Attachment is obsolete: true
Updated•21 years ago
Attachment #126587 -
Flags: superreview?(alecf)
Attachment #126587 -
Flags: review?(smontagu)
Assignee | ||
Comment 14•21 years ago
Comment on attachment 126587: v3 Update languageNames.properties (non-controversial changes only)
r=smontagu
Attachment #126587 -
Flags: review?(smontagu) → review+
Assignee | ||
Updated•21 years ago
Attachment #126544 -
Flags: review?(smontagu)
Comment 15•21 years ago
Comment on attachment 126587: v3 Update languageNames.properties (non-controversial changes only)
sr=alecf
Attachment #126587 -
Flags: superreview?(alecf) → superreview+
Comment 16•21 years ago
Checking in xpfe/global/resources/locale/en-US/languageNames.properties;
/cvsroot/mozilla/xpfe/global/resources/locale/en-US/languageNames.properties,v <-- languageNames.properties
new revision: 1.11; previous revision: 1.10
done
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Comment 17•21 years ago
biesi checked in the 'non-controversial' patch, but I'd like to reopen this to document the differences between the current version and the standard, so that we can decide what else (if anything) we'd like to change.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 18•21 years ago
Ok, here's a list of the remaining differences.

Code  We say          ISO639-2 says
dz    Bhutani         Dzongkha
fo    Faeroese        Faroese
gd    Scots Gaelic    Gaelic /or/ Scottish Gaelic
gl    Galician        Gallegan [wontfix, see below]
ik    Inupiak         Inupiaq
km    Cambodian       Khmer
lo    Laothian        Lao
pa    Punjabi         Panjabi
ps    Pashto          Pushto
rm    Rhaeto-Romanic  Raeto-Romance
rn    Kirundi         Rundi
sg    Sangro          Sango
si    Singhalese      Sinhalese
ss    Siswati         Swati
su    Sudanese        Sundanese

Codes (el, ia, oc, to) technically also differ from the official descriptions, but only because we truncate the description - 'Greek' rather than 'Greek, Modern (1453-)', for example.

I'm not suggesting that we should change all of the above. In fact, we have already decided /not/ to change at least one of them (gl - bug 127946 comment 7). I do wonder whether any of the differences are caused because we're using the foreign-language name of the language rather than the English-language name (like using 'Deutsch' instead of 'German'). See bug 208295 for an example of this.

I'm hoping we should be able to classify the remaining differences into one of four categories:
1. spelled wrong (will fix),
2. foreign-language name rather than English-language name (will fix),
3. wrong for another reason (will fix),
4. wontfix (for whatever reason).

Alternatively, we could just decide which ones to fix, but this topic seems particularly contentious (no surprise), so we should document why we are or aren't changing things.
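A list of differences like the one in this comment can be produced mechanically. The following is a small sketch under the assumption that both our names and the standard's names are available as code-to-name dictionaries; `name_differences` is a hypothetical helper, not part of any Mozilla tooling, and it does not handle the truncated descriptions mentioned for el, ia, oc and to.

```python
def name_differences(ours, standard):
    """Return (code, our_name, standard_name) for shared codes whose names differ."""
    return [
        (code, ours[code], standard[code])
        for code in sorted(ours.keys() & standard.keys())
        if ours[code] != standard[code]
    ]


# Example using a few entries taken from the list above.
ours = {"sg": "Sangro", "su": "Sudanese", "de": "German"}
standard = {"sg": "Sango", "su": "Sundanese", "de": "German", "ii": "Sichuan Yi"}
for code, our_name, standard_name in name_differences(ours, standard):
    print(f"{code:<5} {our_name:<10} {standard_name}")
```

Codes present on only one side (such as ii in the example) are deliberately ignored here, since they are additions rather than naming differences.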
Comment 19•21 years ago
One on that list that *does* look wrong to me is Sudanese / Sundanese. From what I can see, 'Sudanese' refers to the people of Sudan, in Africa, and is not the name of a language (the Sudanese primarily speak Arabic), while 'Sundanese' appears to be the language spoken by the Sundanese in Indonesia.
Assignee | ||
Comment 20•21 years ago
I thought I had commented about "Sudanese" earlier. It's certainly a typo for "Sundanese" and should be corrected.
Comment 21•21 years ago
Hi, this bug is important for me. We won't be able to translate Google into Aragonese until IE or Mozilla includes this language (their rules). Thx.
Comment 22•21 years ago
Mozilla already includes Aragonese (language code 'an'), it's just not visible in the dialog by default, though you can still enter it manually. If you want it to be visible in the dialog, please file a separate bug.
Comment 23•21 years ago
Hi, I'm the localization contributor for Sorbian. I accidentally found this bug and noticed that, as of 2003-09-01, the language code wen that I have used until now was changed (and split) into dsb (for Lower Sorbian) and hsb (for Upper Sorbian).

With Mozilla 1.6b I have a problem: I can't create a profile directly for Sorbian. I have to switch via Edit-->Preferences-->Appearance-->Languages/Content. Since Mozilla 1.6b the entry is no longer in the file res/languages.properties; up to Mozilla 1.6a there was an entry "sb.accept=true". Is this missing line the reason that I can't create a Sorbian profile directly? Or maybe there is a problem in my language pack (http://www.sorbzilla.de/lanwende.xpi).
Comment 24•21 years ago
Michael, 'sb' was removed in bug 224546 because I couldn't find any trace of it having ever been defined on the official ISO 639 site (http://www.loc.gov/standards/iso639-2/). 'hsb', 'dsb' and 'wen' are defined for High Sorbian, Low Sorbian and 'Sorbian Languages' in ISO 639-2, but I couldn't find 'sb' at http://www.loc.gov/standards/iso639-2/codechanges.html

So, I guess the fix is to add a 'wen.accept=true' line, unless there are two separate language packs for High Sorbian and Low Sorbian. Can you file a bug on that (it is off-topic here) and assign it to me?
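For readers unfamiliar with the accept flags discussed here, this is a minimal sketch of how entries of the form "code.accept=true" could be collected from a properties file. The entry format is taken from the comments above; how Mozilla actually reads the file is not shown in this bug, and `accepted_codes` is a hypothetical helper.

```python
def accepted_codes(lines):
    """Collect language codes flagged with '<code>.accept=true'."""
    suffix = ".accept"
    codes = []
    for line in lines:
        key, sep, value = line.strip().partition("=")
        key = key.strip()
        # Lines without "=" (comments, blanks) have an empty separator.
        if sep and key.endswith(suffix) and value.strip() == "true":
            codes.append(key[: -len(suffix)])
    return sorted(codes)
```

Under this reading, removing the "sb.accept=true" line drops sb from the result, and adding "wen.accept=true" would make wen appear in its place.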
Comment 25•20 years ago
Punjabi is now the preferred way of writing Punjabi/Panjabi. Panjabi is actually the correct transliteration (if you take the inherent vowel as being an 'a'), but because of the way it is pronounced in Punjabi, the vowel used is actually somewhere in between 'a', 'e' and 'u'. :D Thus, for English speakers, the letter 'u' is the most appropriate.
Comment 27•18 years ago
*** Bug 353278 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 28•18 years ago
> I'm hoping we should be able to classify the remaining differences into one of
> four categories: 1. spelled wrong (will fix), 2. foreign-language name rather
> than English-language name (will fix), 3. wrong for another reason (will fix),
> 4. wontfix (for whatever reason).
>
> Alternatively, we could just decide which ones to fix, but this topic seems
> particularly contentious (no surprise), so we should document why we are or
> aren't changing things.

OK, let's have a whack at least at category 1. Where possible I'll use English-language sources from the sites of government agencies or language committees. I'm also changing the summary and URL to reflect that the IANA registry is now the normative source for language codes (per RFC 4646).

Code  We say    IANA says  Source
fo    Faeroese  Faroese    http://www.fmn.fo/malnevndin/about.htm
ik    Inupiak   Inupiaq    http://www.uaf.edu/anlc/langs/i.html
sg    Sangro    Sango      http://www.ethnologue.com/14/show_iso639.asp?code=sg
su    Sudanese  Sundanese  http://www.ethnologue.com/14/show_iso639.asp?code=su

I think that only sg and su are actually "spelled wrong" in that list. The others are alternative spellings where the IANA spelling seems more normative.

For the following, I can't find a source to prefer either of the two alternatives:

Code  We say      IANA says
lo    Laothian    Lao
ps    Pashto      Pushto
rn    Kirundi     Rundi
si    Singhalese  Sinhalese
ss    Siswati     Swati
Summary: Bring languageNames.properties up to date with iso639-2 → Bring languageNames.properties up to date with IANA registry
Assignee | ||
Comment 29•18 years ago
(In reply to comment #28)
> For the following, I can't find a source to prefer either of the two
> alternatives:

Add to this list:

Code  We say          IANA says
rm    Rhaeto-Romanic  Raeto-Romance

Category 2 (fix):

Code  We say     IANA says
dz    Bhutani    Dzongkha
km    Cambodian  Khmer

These both seem to be the other way round from Persian/Farsi: IANA is using a native name and we are using an English name. In both cases, as far as I can tell, the native name is used by native speakers when writing in English. See http://www.education.gov.bt/Departments/DDA/DDA.htm and http://www.mot.gov.kh/learn_khmer.asp
Assignee | ||
Comment 30•18 years ago
Attachment #242623 -
Flags: review?(jshin1987)
Comment 31•18 years ago
Comment on attachment 242623: Patch with the changes from the last few comments
r=jshin, sorry for the delay
Attachment #242623 -
Flags: review?(jshin1987) → review+
Assignee | ||
Comment 32•18 years ago
Checked in and closing bug. Future work will be done in bug 356038.
Status: REOPENED → RESOLVED
Closed: 21 years ago → 18 years ago
Resolution: --- → FIXED
Updated•18 years ago
Flags: in-testsuite-
Comment 33•12 years ago
test