Open
Bug 1370185
Opened 7 years ago
Updated 2 years ago
Sorting Tibetan script (Tibetan or Dzongkha language)
Categories
(Core :: JavaScript: Internationalization API, defect, P5)
UNCONFIRMED
People
(Reporter: elie.roux, Unassigned)
Details
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
Build ID: 20170419042421

Steps to reproduce:

1. Install the dz_BT and bo_IN locales (I'm under Debian Sid).
2. Run the following code:

    var tibCollator = new Intl.Collator('dz');
    var tibSortedArray = ["ང", "རྔ", "ལྔ", "སྔ", "བརྔ", "བསྔ", "ཅ"];
    var tibRandomArray = ["ལྔ", "ང", "ཅ", "རྔ", "སྔ", "བརྔ", "བསྔ"];
    var tibResultArray = ["ལྔ", "ང", "ཅ", "རྔ", "སྔ", "བརྔ", "བསྔ"];
    tibResultArray.sort(tibCollator.compare);
    var resPrint = "";
    if (JSON.stringify(tibSortedArray)==JSON.stringify(tibResultArray)) {
        resPrint = "ok!";
    } else {
        resPrint = "error: "+JSON.stringify(tibRandomArray)+" has been sorted as "+JSON.stringify(tibResultArray)+", should have been "+JSON.stringify(tibSortedArray);
    }
    console.log(resPrint);

Actual results:

See first the expected:

    dz, bo, dz-BT, bo-IN

then the unexpected

    error: ["ལྔ","ང","ཅ","རྔ","སྔ","བརྔ","བསྔ"] has been sorted as ["ང","ཅ","བརྔ","བསྔ","རྔ","ལྔ","སྔ"], should have been ["ང","རྔ","ལྔ","སྔ","བརྔ","བསྔ","ཅ"]

Expected results:

The array of Tibetan strings should have been sorted correctly. Dzongkha sorting data has been in the CLDR files for a long time and is also present in GNU glibc, as one can see in /usr/share/i18n/locales/dz_BT, so I have no idea why this doesn't work.

Note that it also does not work in Chrome, but I think the reason is quite different: https://bugs.chromium.org/p/chromium/issues/detail?id=729508
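For anyone retesting, the check above can be repeated for each of the four locale tags mentioned in the report. The following is only a sketch: `checkTibetanSort` is a hypothetical helper name, the test strings are the reporter's, and the output depends on the ICU data bundled with the engine it runs in.

```javascript
// Sketch only: checkTibetanSort is a hypothetical helper, not part of any API.
// It repeats the reporter's check for one locale tag and returns a summary.
function checkTibetanSort(locale) {
  // Expected order and shuffled input, both taken from the report.
  const expected = ["ང", "རྔ", "ལྔ", "སྔ", "བརྔ", "བསྔ", "ཅ"];
  const shuffled = ["ལྔ", "ང", "ཅ", "རྔ", "སྔ", "བརྔ", "བསྔ"];

  const collator = new Intl.Collator(locale);
  // Sort a copy so the shuffled input stays available for reporting.
  const sorted = shuffled.slice().sort(collator.compare);

  return {
    requested: locale,
    // Whether the engine claims collation support for this tag.
    supported: Intl.Collator.supportedLocalesOf([locale]).length > 0,
    // The locale the engine reports after negotiation.
    resolved: collator.resolvedOptions().locale,
    sorted: sorted,
    ok: JSON.stringify(sorted) === JSON.stringify(expected),
  };
}

for (const locale of ["dz", "bo", "dz-BT", "bo-IN"]) {
  console.log(JSON.stringify(checkTibetanSort(locale)));
}
```

If `supported` is true while `ok` is false, the engine claims the locale but does not apply the expected ordering, which is the behaviour described in this report.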
Updated•7 years ago
Component: JavaScript Engine → JavaScript: Internationalization API
Comment 1•7 years ago
I think I've identified the root cause of this bug: we're using ICU to implement the Intl.Collator object, and it seems like ICU is returning inconsistent data about the supported collation types. ICU claims it supports "dz" (per ucol_getAvailable), but when we construct the UCollator object, the actual locale is the root locale. (I still need to verify this for ICU4C, but at least that's the case for ICU4J.)

This bug can also be reproduced for the locales "bo" (which imports the collation rules from "dz") and "wae", and also for the collation "de-u-co-eor" (per ucol_getKeywordValuesForLocale, "de" supports "eor", but the actual collator uses "und-u-co-eor").

"dz", "bo", "wae", and "de-u-co-eor" all have in common that their status is either draft="unconfirmed" or draft="provisional" (http://cldr.unicode.org/index/process#resolution_procedure). So it seems like we should file a bug at ICU's bug tracker...
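The "claimed" side of this mismatch can be inspected from script. The sketch below uses only the standard `Intl.Collator.supportedLocalesOf` and `resolvedOptions()` calls; `describeCollator` is a hypothetical helper name. One caveat, stated as an assumption: `resolvedOptions()` reports the result of locale negotiation against the engine's list of available locales, so it may still report "dz" even when the underlying ICU collator uses the root rules. It therefore cannot confirm the fallback by itself; only a sort test like the one in the description can.

```javascript
// Sketch only: describeCollator is a hypothetical helper, not part of any API.
// It reports what the engine claims for a tag, not what ICU uses internally.
function describeCollator(tag) {
  const options = new Intl.Collator(tag).resolvedOptions();
  return {
    requested: tag,
    // Tags the engine claims to support for collation (empty if none).
    claimed: Intl.Collator.supportedLocalesOf([tag]),
    // Locale and collation type reported after negotiation.
    resolvedLocale: options.locale,
    resolvedCollation: options.collation,
  };
}

// The tags named in the comment above.
for (const tag of ["dz", "bo", "wae", "de-u-co-eor"]) {
  console.log(JSON.stringify(describeCollator(tag)));
}
```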
Comment 2•7 years ago
Thanks a lot for the rapid answer! I've opened a ticket: http://bugs.icu-project.org/trac/ticket/13224

When/if you have a small example showing the bug, can you upload it to the ICU ticket?
Comment 3•7 years ago
(In reply to Elie Roux from comment #2)
> Thanks a lot for the rapid answer! I've opened a ticket:
> http://bugs.icu-project.org/trac/ticket/13224
>
> When/if you have a small example showing the bug can you upload it on the
> ICU ticket?

Thank you for reporting this issue! I've added a simple test case to the ICU ticket; hopefully this helps to determine what needs to be changed to get this issue resolved.
Updated•6 years ago
Priority: -- → P5
Updated•2 years ago
Severity: normal → S3